OpenSearch/docs/reference/slm/getting-started-slm.asciidoc

185 lines
9.5 KiB
Plaintext
Raw Normal View History

[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
[role="xpack"]
[testenv="basic"]
[[getting-started-snapshot-lifecycle-management]]
=== Tutorial: Automate backups with {slm-init}
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
This tutorial demonstrates how to automate daily backups of {es} data streams and indices using an {slm-init} policy.
The policy takes <<modules-snapshots, snapshots>> of all data streams and indices in the cluster
and stores them in a local repository.
It also defines a retention policy and automatically deletes snapshots
when they are no longer needed.
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
To manage snapshots with {slm-init}, you:
. <<slm-gs-register-repository, Register a repository>>.
. <<slm-gs-create-policy, Create an {slm-init} policy>>.
To test the policy, you can manually trigger it to take an initial snapshot.
[discrete]
[[slm-gs-register-repository]]
==== Register a repository
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
To use {slm-init}, you must have a snapshot repository configured.
The repository can be local (shared filesystem) or remote (cloud storage).
Remote repositories can reside on S3, HDFS, Azure, Google Cloud Storage,
or any other platform supported by a {plugins}/repository.html[repository plugin].
Remote repositories are generally used for production deployments.
For this tutorial, you can register a local repository from
{kibana-ref}/snapshot-repositories.html[{kib} Management]
or use the put repository API:
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
[source,console]
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
-----------------------------------
PUT /_snapshot/my_repository
{
"type": "fs",
"settings": {
"location": "my_backup_location"
}
}
-----------------------------------
[discrete]
[[slm-gs-create-policy]]
==== Set up a snapshot policy
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
Once you have a repository in place,
you can define an {slm-init} policy to take snapshots automatically.
The policy defines when to take snapshots, which data streams or indices should be included,
and what to name the snapshots.
A policy can also specify a <<slm-retention,retention policy>> and
automatically delete snapshots when they are no longer needed.
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
TIP: Don't be afraid to configure a policy that takes frequent snapshots.
Snapshots are incremental and make efficient use of storage.
You can define and manage policies through {kib} Management or with the put policy API.
For example, you could define a `nightly-snapshots` policy
to back up all of your data streams and indices daily at 2:30AM UTC.
A put policy request defines the policy configuration in JSON:
[source,console]
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots
{
"schedule": "0 30 1 * * ?", <1>
"name": "<nightly-snap-{now/d}>", <2>
"repository": "my_repository", <3>
"config": { <4>
"indices": ["*"] <5>
Add retention to Snapshot Lifecycle Management (backport of #4… (#46506) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-*", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. * Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deletion snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From https://github.com/original-brownbear/elasticsearch/commit/2374316f0d1912c9e1498bece195546a1dc60bce.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-*", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 * Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy
2019-09-10 11:08:09 -04:00
},
"retention": { <6>
"expire_after": "30d", <7>
"min_count": 5, <8>
"max_count": 50 <9>
}
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
}
--------------------------------------------------
// TEST[continued]
<1> When the snapshot should be taken in
<<schedule-cron,Cron syntax>>: daily at 2:30AM UTC
<2> How to name the snapshot: use
<<date-math-index-names,date math>> to include the current date in the snapshot name
<3> Where to store the snapshot
<4> The configuration to be used for the snapshot requests (see below)
<5> Which data streams or indices to include in the snapshot: all data streams and indices
<6> Optional retention policy: keep snapshots for 30 days,
retaining at least 5 and no more than 50 snapshots regardless of age
You can specify additional snapshot configuration options to customize how snapshots are taken.
For example, you could configure the policy to fail the snapshot
if one of the specified data streams or indices is missing.
For more information about snapshot options, see <<snapshots-take-snapshot,snapshot requests>>.
[discrete]
[[slm-gs-test-policy]]
==== Test the snapshot policy
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
A snapshot taken by {slm-init} is just like any other snapshot.
You can view information about snapshots in {kib} Management or
get info with the <<snapshots-monitor-snapshot-restore, snapshot APIs>>.
In addition, {slm-init} keeps track of policy successes and failures so you
have insight into how the policy is working. If the policy has executed at
least once, the <<slm-api-get-policy, get policy>> API returns additional metadata
that shows if the snapshot succeeded.
You can manually execute a snapshot policy to take a snapshot immediately.
This is useful for taking snapshots before making a configuration change,
upgrading, or to test a new policy.
Manually executing a policy does not affect its configured schedule.
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
Instead of waiting for the policy to run, tell {slm-init} to take a snapshot
using the configuration right now instead of waiting for 1:30 a.m..
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
[source,console]
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
--------------------------------------------------
POST /_slm/policy/nightly-snapshots/_execute
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
--------------------------------------------------
// TEST[skip:we can't easily handle snapshots from docs tests]
After forcing the `nightly-snapshots` policy to run,
you can retrieve the policy to get success or failure information.
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
[source,console]
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
--------------------------------------------------
GET /_slm/policy/nightly-snapshots?human
--------------------------------------------------
// TEST[continued]
Only the most recent success and failure are returned,
but all policy executions are recorded in the `.slm-history*` indices.
The response also shows when the policy is scheduled to execute next.
NOTE: The response shows if the policy succeeded in _initiating_ a snapshot.
However, that does not guarantee that the snapshot completed successfully.
It is possible for the initiated snapshot to fail if, for example, the connection to a remote
repository is lost while copying files.
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
[source,console-result]
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
--------------------------------------------------
{
"nightly-snapshots" : {
"version": 1,
"modified_date": "2019-04-23T01:30:00.000Z",
"modified_date_millis": 1556048137314,
"policy" : {
"schedule": "0 30 1 * * ?",
"name": "<nightly-snap-{now/d}>",
"repository": "my_repository",
"config": {
"indices": ["*"],
Add retention to Snapshot Lifecycle Management (backport of #4… (#46506) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-*", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. * Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deletion snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From https://github.com/original-brownbear/elasticsearch/commit/2374316f0d1912c9e1498bece195546a1dc60bce.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-*", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 * Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy
2019-09-10 11:08:09 -04:00
},
"retention": {
"expire_after": "30d",
"min_count": 5,
"max_count": 50
}
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
},
"last_success": { <1>
"snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2>
"time_string": "2019-04-24T16:43:49.316Z",
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
"time": 1556124229316
} ,
"last_failure": { <3>
"snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
"time_string": "2019-04-02T01:30:00.000Z",
"time": 1556042030000,
"details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
} ,
"next_execution": "2019-04-24T01:30:00.000Z", <4>
"next_execution_millis": 1556048160000
[7.x] Add Snapshot Lifecycle Management (#44382) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It *does* record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "*/30 * * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 09:37:13 -04:00
}
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
<1> Information about the last time the policy successfully initated a snapshot
<2> The name of the snapshot that was successfully initiated
<3> Information about the last time the policy failed to initiate a snapshot
<4> The next time the policy will execute