Commit Graph

105 Commits

Author SHA1 Message Date
Armin Braun 6aaee8aa0a
Repository Cleanup Endpoint (#43900) (#45780)
* Repository Cleanup Endpoint (#43900)

* Snapshot cleanup functionality via transport/REST endpoint.
* Added all the infrastructure for this with the HLRC and node client
* Made use of it in tests and resolved relevant TODO
* Added new `Custom` CS element that tracks the cleanup logic.
Kept it similar to the delete and in progress classes and gave it
some (for now) redundant way of handling multiple cleanups but only allow one
* Use the exact same mechanism used by deletes to have the combination
of CS entry and increment in repository state ID provide some
concurrency safety (the initial approach of just an entry in the CS
was not enough, we must increment the repository state ID to be safe
against concurrent modifications, otherwise we run the risk of "cleaning up"
blobs that just got created without noticing)
* Isolated the logic to the transport action class as much as I could.
It's not ideal, but we don't need to keep any state and do the same
for other repository operations
(like getting the detailed snapshot shard status)
2019-08-21 17:59:49 +02:00
Chris Dean deab736aad
[DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552) (#45579)
* changes to chunk_size #41591

* update to chunk size to include ` `

* Update docs/plugins/repository-azure.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/modules/snapshots.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/plugins/repository-azure.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/plugins/repository-s3.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* edits to fix passive voice
2019-08-14 15:59:36 -05:00
Chris Dean caa2a7738f Revert "[DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552)"
This reverts commit 8fdbcd7395.
2019-08-14 15:14:10 -05:00
Chris Dean 8fdbcd7395 [DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552)
* changes to chunk_size #41591

* update to chunk size to include ` `

* Update docs/plugins/repository-azure.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/modules/snapshots.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/plugins/repository-azure.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/plugins/repository-s3.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* edits to fix passive voice
2019-08-14 14:15:22 -05:00
Chris Dean 82d48cfcc9
[DOCS] Added cross-link to snapshot lifecycle management. Closes #44588. (#45408) (#45468)
merging #44588 changes into 7.x
2019-08-12 15:13:11 -05:00
James Rodewig 76c7e3a05f [DOCS] Replace `_meta` with `metadata` for snapshot APIs. (#44596)
elastic/elasticsearch#41281 added custom metadata parameter to
snapshots. During review, the parameter name was changed from '_meta' to
'metadata,' but the documentation wasn't updated. This corrects the
documentation to use the 'metadata' name.
2019-07-19 08:40:57 -04:00
Lee Hinman fb0461ac76
[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)

* Add SnapshotLifecycleService and related CRUD APIs

This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.

This also includes the get, put, and delete APIs for these policies

Relates to #38461

* Make scheduledJobIds return an immutable set

* Use Object.equals for SnapshotLifecyclePolicy

* Remove unneeded TODO

* Implement ToXContentFragment on SnapshotLifecyclePolicyItem

* Copy contents of the scheduledJobIds

* Handle snapshot lifecycle policy updates and deletions (#40062)

(Note this is a PR against the `snapshot-lifecycle-management` feature branch)

This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.

Relates to #38461

* Take a snapshot for the policy when the SLM policy is triggered (#40383)

(This is a PR for the `snapshot-lifecycle-management` branch)

This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.

This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.

Relates to #38461

* Record most recent snapshot policy success/failure (#40619)

Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.

This is the first step toward writing history in a more permanent
fashion.

* Validate snapshot lifecycle policies (#40654)

(This is a PR against the `snapshot-lifecycle-management` branch)

With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.

Part of #38461

* Hook SLM into ILM's start and stop APIs (#40871)

(This pull request is for the `snapshot-lifecycle-management` branch)

This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.

Relates to #38461

* Add tests for SnapshotLifecyclePolicyItem (#40912)

Adds serialization tests for SnapshotLifecyclePolicyItem.

* Fix improper import in build.gradle after master merge

* Add human readable version of modified date for snapshot lifecycle policy (#41035)

* Add human readable version of modified date for snapshot lifecycle policy

This small change changes it from:

```
...
"modified_date": 1554843903242,
...
```

To

```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```

Including the `"modified_date"` field when the `?human` field is used.

Relates to #38461

* Fix test

* Add API to execute SLM policy on demand (#41038)

This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.

```json
PUT /_ilm/snapshot/<policy>/_execute
```

And it returns the response with the generated snapshot name:

```json
{
  "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```

Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.

Relates to #38461

* Add next_execution to SLM policy metadata (#41221)

* Add next_execution to SLM policy metadata

This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:

```json
GET /_ilm/snapshot?human
{
  "production" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:16:21.865Z",
    "modified_date_millis" : 1555362981865,
    "policy" : {
      "name" : "<production-snap-{now/d}>",
      "schedule" : "*/30 * * * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "foo-*",
          "important"
        ],
        "ignore_unavailable" : true,
        "include_global_state" : false
      }
    },
    "next_execution" : "2019-04-15T21:16:30.000Z",
    "next_execution_millis" : 1555362990000
  },
  "other" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:12:19.959Z",
    "modified_date_millis" : 1555362739959,
    "policy" : {
      "name" : "<other-snap-{now/d}>",
      "schedule" : "0 30 2 * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "other"
        ],
        "ignore_unavailable" : false,
        "include_global_state" : true
      }
    },
    "next_execution" : "2019-04-16T02:30:00.000Z",
    "next_execution_millis" : 1555381800000
  }
}
```

Relates to #38461

* Fix and enhance tests

* Figured out how to Cron

* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)

This commit changes the endpoint for snapshot lifecycle management from:

```
GET /_ilm/snapshot/<policy>
```

to:

```
GET /_slm/policy/<policy>
```

It mimics the ILM path only using `slm` instead of `ilm`.

Relates to #38461

* Add initial documentation for SLM (#41510)

* Add initial documentation for SLM

This adds the initial documentation for snapshot lifecycle management.

It also includes the REST spec API json files since they're sort of
documentation.

Relates to #38461

* Add `manage_slm` and `read_slm` roles (#41607)

* Add `manage_slm` and `read_slm` roles

This adds two more built in roles -

`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.

`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.

Relates to #38461

* Add execute to the test

* Fix ilm -> slm typo in test

* Record SLM history into an index (#41707)

It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.

This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.

Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s.  Watcher would cause the same problem, but it
is already disabled (for the same reason).

* High Level Rest Client support for SLM (#41767)

* High Level Rest Client support for SLM

This commit add HLRC support for SLM.

Relates to #38461

* Fill out documentation tests with tags

* Add more callouts and asciidoc for HLRC

* Update javadoc links to real locations

* Add security test testing SLM cluster privileges (#42678)

* Add security test testing SLM cluster privileges

This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.

Relates to #38461

* Don't redefine vars

*  Add Getting Started Guide for SLM  (#42878)

This commit adds a basic Getting Started Guide for SLM.

* Include SLM policy name in Snapshot metadata (#43132)

Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.

* Fix compilation after master merge

* [TEST] Move exception wrapping for devious exception throwing

Fixes an issue where an exception was created from one line and thrown in another.

* Fix SLM for the change to AcknowledgedResponse

* Add Snapshot Lifecycle Management Package Docs (#43535)

* Fix compilation for transport actions now that task is required

* Add a note mentioning the privileges needed for SLM (#43708)

* Add a note mentioning the privileges needed for SLM

This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.

Relates to #38461

* Mention that you can create snapshots for indices you can't read

* Fix REST tests for new number of cluster privileges

* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)

* Fix SnapshotHistoryStoreTests after merge

* Remove overridden newResponse functions that have been removed

* Fix compilation for backport

* Fix get snapshot output parsing in test

* [DOCS] Add redirects for removed autogen anchors (#44380)

* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 07:37:13 -06:00
Albert Zaharovits 018d946bba [DOC] Backup & Restore Security Configuration (#42970)
This commit documents the backup and restore of a cluster's
security configuration.

It is not possible to only backup (or only restore) security
configuration, independent to the rest of the cluster's conf,
so this describes how a full configuration backup&restore
will include security as well. Moreover, it explains how part
of the security conf data resides on the special .security
index and how to backup that using regular data snapshot API.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
Co-Authored-By: Tim Vernum <tim@adjective.org>
2019-07-10 14:53:56 +03:00
Lisa Cawley 8ffd9c6981 [DOCS] Adds administering section (#43493) 2019-06-24 10:15:25 -07:00
Gordon Brown 6eb4600e93
Add custom metadata to snapshots (#41281)
Adds a metadata field to snapshots which can be used to store arbitrary
key-value information. This may be useful for attaching a description of
why a snapshot was taken, tagging snapshots to make categorization
easier, or identifying the source of automatically-created snapshots.
2019-06-05 17:30:31 -06:00
James Rodewig 53702efddd [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
Alex Doerr 740020dfe8 Clarify version compatibility in snapshot/restore docs (#39329) 2019-03-19 14:43:38 +01:00
Henning Andersen 95e61d4bb1
Blob Store compress default false (#40054)
Fixed documentation to comply with code (compress=false
is default until 8.0).
2019-03-15 12:31:03 +01:00
David Turner 1a23417aeb
[Zen2] Update documentation for Zen2 (#34714)
This commit overhauls the documentation of discovery and cluster coordination,
removing mention of the Zen Discovery module and replacing it with docs for the
new cluster coordination mechanism introduced in 7.0.

Relates #32006
2018-12-20 13:02:44 +00:00
Alexander Zhukov 842809ef37 Doc: Drop an extra 'a' in snapshots.asciidoc (#35251) 2018-11-05 13:31:35 -05:00
David Turner c9765d5fb9
Emphasize that filesystem-level backups don't work (#33102)
It is not obvious that a filesystem-level backup may capture an inconsistent
set of files that may fail on restore, or (worse) succeed having silently
discarded some data. This change spells the out, and reorganises the first page
or so of the snapshot/restore docs to make this warning fit more nicely.
2018-09-19 08:36:03 +01:00
Simon Willnauer c783488e97
Add `_source`-only snapshot repository (#32844)
This change adds a `_source` only snapshot repository that allows to wrap
any existing repository as a _backend_ to snapshot only the `_source` part
including live docs markers. Snapshots taken with the `source` repository
won't include any indices,  doc-values or points. The snapshot will be reduced in size and
functionality such that it requires full re-indexing after it's successfully restored.

The restore process will copy the `_source` data locally starts a special shard and engine
to allow `match_all` scrolls and searches. Any other query, or get call will fail with and unsupported operation exception.  The restored index is also marked as read-only.

This feature aims mainly for disaster recovery use-cases where snapshot size is
a concern or where time to restore is less of an issue.

**NOTE**: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.
2018-09-12 17:47:10 +02:00
Vladimir Dolzhenko b55b079a90
Include size of snapshot in snapshot metadata #18543, bwc clean up (#30890) 2018-05-26 21:20:44 +02:00
Vladimir Dolzhenko 81eb8ba0f0
Include size of snapshot in snapshot metadata (#29602)
Include size of snapshot in snapshot metadata

Adds difference of number of files (and file sizes) between prev and current snapshot. Total number/size reflects total number/size of files in snapshot.

Closes #18543
2018-05-25 21:04:50 +02:00
Tanguy Leroux c351b51ac4
[Docs] Fix inconsistencies in snapshot/restore doc (#30480)
Closes #30444
2018-05-22 09:19:07 +02:00
Vladimir Dolzhenko fe3e0257ae
Allow date math for naming newly-created snapshots (#7939) (#30479)
Allow date math for naming newly-created snapshots (#7939)
2018-05-16 07:23:25 +02:00
Tanguy Leroux 63148dd9ba
Fail snapshot operations early on repository corruption (#30140)
A NullPointerException is thrown when trying to create or delete
a snapshot in a repository that has been written to by an older 
Elasticsearch after writing to it with a newer Elasticsearch version.

This is because the way snapshots are formatted in the repository 
snapshots index file changed in #24477.

This commit changes the parsing of the repository index file so that 
it now detects a corrupted index file and fails early the snapshot 
operation.

closes #29052
2018-04-27 16:29:59 +02:00
Yannick Welsch 3b8a8867c4
[DOCS] Unregister repository instead of deleting it (#29206)
Relates to #15426
2018-03-23 15:53:36 +01:00
Christoph Büscher 0d11b9fe34
[Docs] Unify spelling of Elasticsearch (#27567)
Removes occurences of "elasticsearch" or "ElasticSearch" in favour of
"Elasticsearch" where appropriate.
2017-11-29 09:44:25 +01:00
Shubham Aggarwal 5a925cd40c Fixed references to Multi Index Syntax (#27283) 2017-11-06 19:15:36 +01:00
Igor Motov d14486bce6
Docs: restore now fails if it encounters incompatible settings (#26933)
This change was introduced in 5.0.0, but the documentation wasn't updated to reflect it.

Closes #26453
2017-10-31 20:04:00 -04:00
Deb Adair b57cb83567 [DOCS] Added info about snapshotting your data before an upgrade. 2017-10-06 12:14:26 -07:00
Ali Beyad 743217a430 Enhances get snapshots API to allow retrieving repository index only (#24477)
Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all)
provides information about snapshots in the repository, including the
snapshot state, number of shards snapshotted, failures, etc.  In order
to provide information about each snapshot in the repository, the call
must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for
every snapshot.  In cloud-based repositories, this can be expensive,
both from a cost and performance perspective.  Sometimes, all the user
wants is to retrieve all the names/uuids of each snapshot, and the
indices that went into each snapshot, without any of the other status
information about the snapshot.  This minimal information can be
retrieved from the repository index blob (`index-N`) without needing to
read each snapshot metadata blob.

This commit enhances the get snapshots API with an optional `verbose`
parameter.  If `verbose` is set to false on the request, then the get
snapshots API will only retrieve the minimal information about each
snapshot (the name, uuid, and indices in the snapshot), and only read
this information from the repository index blob, thereby giving users
the option to retrieve the snapshots in a repository in a more
cost-effective and efficient manner.

Closes #24288
2017-05-10 15:48:40 -04:00
Ali Beyad a4b37bf7fe [DOCS] Multiple clusters connected to the same repository (#23807) 2017-03-30 13:08:41 -04:00
Igor Motov 63e1403017 Docs: add description of possible snapshot states 2017-03-23 15:20:38 -04:00
Ali Beyad 71739623d3 Consolify snapshot documentation (#23189)
This commit brings the snapshot documentation in conformity
with the CONSOLE format, and fixes the docs so that the documentation
tests can be run against them.
2017-02-15 18:13:27 -05:00
Lucas Bremgartner 0086b99797 [Docs] Correct setting name in snapshot/restore documentation (#22023)
There is no setting include_cluster_state for snapshot restore. The correct name for this setting is include_global_state.
2016-12-07 14:12:10 +01:00
Nik Everett 3bba7dbe07 Docs: note about snapshot version compatibility (#20896)
It is important that folks understand that snapshot/restore isn't
for archiving. It is appropriate for backup and disaster recovery
but not for archival over long periods of time because of version
incompatibility.

Closes #20866
2016-10-13 10:49:32 -04:00
Pascal Borreli fcb01deb34 Fixed typos (#20843) 2016-10-10 14:51:47 -06:00
Dominik Stadler f0db4d9942 Add an example call of how to stop a snapshot or restore operation (#20153) 2016-08-25 13:01:04 +02:00
Clinton Gormley 15c96df5b5 Replaced "true" with true in snapshot restore docs
Closes #19947
2016-08-12 16:56:56 +02:00
Pius 40f3b5ab76 Update snapshots.asciidoc
>However, the version of the new cluster should be the same or newer than the cluster that was

Afaik, you can't restore a snapshot to a newer cluster that is not consecutively newer (i.e. can't restore 1.x snapshot to a 5.x cluster).  This is to clarify the statement above moving forward.
2016-08-09 18:22:29 -07:00
Eric Sherman 0660de7472 Update snapshots.asciidoc (#18923) 2016-06-20 16:00:59 +02:00
Aaron Mildenstein 41810bd63c Pluralize "index" (#18811)
This doesn't just happen to "an index" unless you're restoring just one.  It reads better this way, IMO.
2016-06-13 20:05:33 +02:00
Lee Hinman c637fea84b Change the default of `include_global_state` from true to false for restores
This changes the default value to be false *only* for restore operations.

Resolves #18569
2016-06-08 10:48:36 -06:00
Tanguy Leroux 35d3bdab84 Add Google Cloud Storage repository plugin
Closes #12880
2016-05-19 13:26:23 +02:00
Clinton Gormley 3f594089c2 Renamed all AUTOSENSE snippets to CONSOLE (#18210) 2016-05-09 15:42:23 +02:00
Vector241-Eric d8948bae5b Fix a quick documentation typo.
Fix the object of the sentence to agree with the plural verb.
2016-03-03 16:36:21 -07:00
gaelL 58355430f6 snapshots doc: Repository Verification `verify` example fix
Replace `my_backup` in the example line by `s3_repository` to match
the example above.
2016-02-16 14:15:21 +01:00
Clinton Gormley b8b3a7aa4f Merge pull request #16236 from dlangille/patch-1
Better wording
2016-01-26 17:55:31 +01:00
David Pilato bc181c0310 Fix s3 with hdfs 2016-01-11 09:56:17 +01:00
David Pilato c514b26c8b Change link for HDFS support to plugins docs
I came to this change when I read https://github.com/elastic/elasticsearch/pull/15591

HDFS plugin link has not been updated when we moved HDFS to elasticsearch repository (#15192).
2016-01-11 09:43:00 +01:00
Yannick Welsch db5594a2af Support wildcards for getting repositories or snapshots
Closes #15151
2015-12-11 10:20:27 +01:00
Xu Zhang 2e6d72de27 Catch exception when reading corrupted snapshot.
Single corrupted snapshot file shouldn't prevent listing all other
snapshot in repository.
2015-11-18 21:43:46 -08:00
Jason O'Donnell 86c21f3c39 Fixing typo 2015-10-26 16:49:05 -04:00