OpenSearch

Lee Hinman fb0461ac76 [7.x] Add Snapshot Lifecycle Management (#44382)	2019-07-16 07:37:13 -06:00
community-clients	Update community client and integration docs (#41513)	2019-04-26 08:57:14 +02:00
groovy-api	Make sure to use the type _doc in the REST documentation. (#34662)	2018-10-22 11:54:04 -07:00
java-api	Reindex max_docs parameter name (#42942)	2019-06-07 12:16:36 +02:00
java-rest	[7.x] Add Snapshot Lifecycle Management (#44382)	2019-07-16 07:37:13 -06:00
painless	Add Datetime Now to Painless Documentation (#43852)	2019-07-02 15:43:34 -07:00
perl	[DOCS] Various spelling corrections (#37046)	2019-01-07 14:44:12 +01:00
plugins	Update discovery-ec2 docs (#43693)	2019-07-11 12:59:38 +01:00
python	Update version numbers in Elasticsearch-Py docs (#40355)	2019-04-02 12:16:24 -04:00
reference	[7.x] Add Snapshot Lifecycle Management (#44382)	2019-07-16 07:37:13 -06:00
resiliency	[DOCS] Fix broken links for 7.0 release (#41036)	2019-04-09 18:20:08 -04:00
ruby	[DOCS] Various spelling corrections (#37046)	2019-01-07 14:44:12 +01:00
src/test	Wait for pending tasks in docs tests cleanup (#44123)	2019-07-15 12:04:27 +01:00
README.asciidoc	Add clarification around TESTSETUP docs and error message (#43306)	2019-07-16 14:58:16 +02:00
Versions.asciidoc	Upgrade to lucene-8.2.0-snapshot-860e0be5378 (#44171) (#44184)	2019-07-11 09:17:22 -05:00
build.gradle	[7.x] Add Snapshot Lifecycle Management (#44382)	2019-07-16 07:37:13 -06:00

README.asciidoc

The Elasticsearch docs are in AsciiDoc format and can be built using the
Elasticsearch documentation build process.

See: https://github.com/elastic/docs

Snippets marked with `// CONSOLE` are automatically annotated with "VIEW IN
CONSOLE" and "COPY AS CURL" in the documentation and are automatically tested
by the command `./gradlew :docs:check`. To test just the docs from a single
page, use e.g. `./gradlew :docs:integTestRunner --tests "*rollover*"`.
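
For orientation, a tested snippet in a docs page looks roughly like the sketch
below. The request itself is only an illustration; the `[source,js]` block
followed by a `// CONSOLE` comment is the part that matters.

```asciidoc
[source,js]
--------------------------------------------------
GET /_cluster/health
--------------------------------------------------
// CONSOLE
```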

NOTE: If you have an elasticsearch-extra folder alongside your elasticsearch
folder, you must temporarily rename it when you are testing 6.3 or later branches.

By default each `// CONSOLE` snippet runs as its own isolated test. You can
manipulate the test execution in the following ways:

* `// TEST`: Explicitly marks a snippet as a test. Snippets marked this way
are tests even if they don't have `// CONSOLE` but usually `// TEST` is used
for its modifiers:
  * `// TEST[s/foo/bar/]`: Replace `foo` with `bar` in the generated test. This
  should be used sparingly because it makes the snippet "lie". Sometimes,
  though, you can use it to make the snippet clearer. Keep in mind that
  if there are multiple substitutions then they are applied in the order that
  they are defined.
  * `// TEST[catch:foo]`: Used to expect errors in the requests. Replace `foo`
  with `bad_request` to expect a 400 error, for example. If the snippet
  contains multiple requests then only the last request will expect the error.
  * `// TEST[continued]`: Continue the test started in the last snippet. Between
  tests the nodes are cleaned: indexes are removed, etc. This marker prevents
  that cleanup from happening between snippets because the two snippets are a
  single test. This is most useful when you have text and snippets that work
  together to tell the story of some use case because it merges the snippets
  (and thus the use case) into one big test.
  * `// TEST[skip:reason]`: Skip this test. Replace `reason` with the actual
  reason to skip the test. Snippets without `// TEST` or `// CONSOLE` aren't
  considered tests anyway but this is useful for explicitly documenting the
  reason why the test shouldn't be run.
  * `// TEST[setup:name]`: Run some setup code before running the snippet. This
  is useful for creating and populating indexes used in the snippet. The setup
  code is defined in `docs/build.gradle`. See `// TESTSETUP` below for a
  similar feature.
  * `// TEST[warning:some warning]`: Expect the response to include a `Warning`
  header. If the response doesn't include a `Warning` header with the exact
  text then the test fails. If the response includes `Warning` headers that
  aren't expected then the test fails.
* `// TESTRESPONSE`: Matches this snippet against the body of the response of
the last test. If the response is JSON then order is ignored. If you add
`// TEST[continued]` to the snippet after `// TESTRESPONSE` it will continue
in the same test, allowing you to interleave requests with responses to check.
  * `// TESTRESPONSE[s/foo/bar/]`: Substitutions. See `// TEST[s/foo/bar/]` for
  how it works. These are much more common than `// TEST[s/foo/bar/]` because
  they are useful for eliding portions of the response that are not pertinent
  to the documentation.
    * One interesting difference here is that you often want to match against
    the response from Elasticsearch. To do that you can reference the "body" of
    the response like this: `// TESTRESPONSE[s/"took": 25/"took": $body.took/]`.
    Note the `$body` string. This says "I don't expect that 25 number in the
    response, just match against what is in the response." Instead of writing
    the path into the response after `$body` you can write `$_path` which
    "figures out" the path. This is especially useful for making sweeping
    assertions like "I made up all the numbers in this example, don't compare
    them" which looks like `// TESTRESPONSE[s/\d+/$body.$_path/]`.
  * You can't use `// TESTRESPONSE` immediately after `// TESTSETUP`. Instead,
  consider using `// TEST[continued]` or rearranging your snippets.
  * `// TESTRESPONSE[non_json]`: Add substitutions for testing responses in a
  format other than JSON. Use this after all other substitutions so it doesn't
  make other substitutions difficult.
  * `// TESTRESPONSE[skip:reason]`: Skip the assertions specified by this
  response.
* `// TESTSETUP`: Marks this snippet as the "setup" for all other snippets in
this file. This is a somewhat natural way of structuring documentation. You
say "this is the data we use to explain this feature" then you add the
snippet that you mark `// TESTSETUP` and then every snippet will turn into
a test that runs the setup snippet first. See the "painless" docs for a file
that puts this to good use. This is fairly similar to `// TEST[setup:name]`
but rather than the setup defined in `docs/build.gradle` the setup is defined
right in the documentation file. In general, we should prefer `// TESTSETUP`
over `// TEST[setup:name]` because it makes it more clear what steps have to
be taken before the examples will work. Tip: `// TESTSETUP` can only be used
on the first snippet of a document.
* `// NOTCONSOLE`: Marks this snippet as neither `// CONSOLE` nor
`// TESTRESPONSE`, excluding it from the list of unconverted snippets. We
should only use this for snippets that *are* JSON but are *not* responses or
requests.
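
As a rough illustration of how several of these markers combine, the sketch
below indexes a document, continues the same test with a search, and then
checks the response while ignoring the parts that vary. The index name
`my_index`, the document, and the `25` in the response are invented for this
example, and the snippets have not been run through the docs build.

```asciidoc
[source,js]
--------------------------------------------------
PUT /my_index/_doc/1?refresh
{
  "user": "kimchy"
}
--------------------------------------------------
// CONSOLE

[source,js]
--------------------------------------------------
GET /my_index/_search
--------------------------------------------------
// CONSOLE
// TEST[continued]

[source,js]
--------------------------------------------------
{
  "took": 25,
  "timed_out": false,
  "_shards": ...,
  "hits": ...
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 25/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
```

The first snippet needs no `// TEST` marker because `// CONSOLE` already makes
it a test. The second uses `// TEST[continued]` so the index created by the
first still exists when the search runs. The substitutions on the third are
applied in the order they are written.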

In addition to the standard CONSOLE syntax these snippets can contain blocks
of YAML surrounded by markers like this:

```
startyaml
- compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}
endyaml
```

This allows slightly more expressive testing of the snippets. Since that syntax
is not supported by CONSOLE the usual way to incorporate it is with a
`// TEST[s//]` marker like this:

```
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}\nendyaml\n/]
```

Any place you can use JSON you can use elements like `$body.path.to.thing`
which is replaced on the fly with the contents of the thing at `path.to.thing`
in the last response.
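
One way this can be put to use is sketched below. It assumes the previous
snippet in the file was a search that returned a `_scroll_id`; the placeholder
`EXAMPLE_SCROLL_ID` is made up for the illustration and is swapped for the
real value when the test is generated.

```asciidoc
[source,js]
--------------------------------------------------
POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "EXAMPLE_SCROLL_ID"
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
// TEST[s/EXAMPLE_SCROLL_ID/$body._scroll_id/]
```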