Previously, phase X's `after` step had `X` as its associated phase. This causes confusion because
we have only entered phase `X` once the `after` step is complete. Therefore, this refactor
pushes the after's phase to be associated with the previous phase. This first phase is an exception.
The first phase's `after` step is associated with the first phase (not some non-existent prior phase).
- Renames EnoughShardsWaitStep to ShrunkShardsAllocatedStep
- Changes ShrunkShardsAllocatedStep to check the shards of the shrunken index rather than the current one
- shrink index prefix is now passed into the steps of the shrink aciton
- Related Test Changes
Adds some missing tests including checking the hashcode and equals methods of `DeleteStep`, `StepKey`, and `TerminalPolicyStep` as well as adding a test for `DeleteAction.toSteps()`
- added ShrinkStep/Tests
- AsyncActionStep now passes in IndexMetaData instead of Index
- Delete usage of ClusterStateActionStep
- with ClusterStateActionStep gone, InitializePolicyContextStep
is the only other ClusterState-nonWait step
- Migrate setting-updates to UpdateSettingsStep
Also renames EnoughShardsWaitStep to ReplicasAllocatedStep, removes it from the allocate action and adds a check that th number of replicas in the cluster state is correct to it.
Various classes had some code that was not used and is not going to be needed so this change cleans up those classes so we don’t have dead code hanging around
The force-merge is an a TODO state due to the
unresolved issue around best_compression.
- updated ReadOnlyStep with tests
- implemented an update to the ForceMergeAction
- added UpdateBestCompressionSettingsStep
- added tests for SegmentCountStep
Looks like we need to split out the tests of core classes to core
and index-lifecycle ones stay in index-lifecycle.
I believe I got everything, although I may have missed at least one thing
checked status with
$ ./gradlew :x-pack-elasticsearch:plugin:index-lifecycle:check -Dtests.seed=39838421912001B4
$ ./gradlew :x-pack-elasticsearch:plugin:core:check -Dtests.seed=39838421912001B4
other things done in this PR:
- removal of a few unused variables/thrown exceptions/imports
- fix TimeseriesLifecycleTypeTests
- an all null AllocateAction was created
- fix AllocateActionTests
- woops. -Dtests.seed=39838421912001B4 resulted in two `null`s and an emptyMap.
this resulted in a test failure.
Specifically this change makes it optional to:
* Specify `includes`, `excludes` and `requires`maps in the allocate action as long as at least one fo the options is specified and is not an empty map
* Specify an `after` parameter on a phase. If no `after` value is specified `TimeValue.ZERO` is used and the phase will be moved to as soon as the previous phase reports `ACTIONS COMPLETED`. `after` is always non-null when we are serialising the Phase.
* Specify a `type` for a LifecyclePolicy. If no `type` is specified `TimeSeriesLifecycleType.INSTANCE` is used since this is currently the only production `type`. `type` is always non-null when we are serialising the LifecyclePolicy.
This PR adds a new setting called `index.lifecycle.date` that
the ShrinkAction will be responsible for populating in the newly created index.
This way, we can continue to know when we should be executing the next phase
relative to the original index creation date, and not that of the shrunken index.
A node can stop being the master node whilst it is running, e.g. if it can’t access `minimumMasterNodes` number of master eligible nodes. Because of this we need logic in `IndexLifecycleService` that cancels the scheduled job if the node is no longer master and re-adds the job if the node becomes master again.
This does the following in sequential service polls
1. sets the index to read-only and runs shrink with a modified `index.lifecycle.name` setting set to `null`.
2. checks to see if shrink is complete, if it is...
b. set target index's `index.lifecycle.*` settings to the original index's values.
3. if not complete, just wait till next iteration
4. if operating on shrunken index, delete old index and add it as an alias to shrunken index
* Adds Allocate lifcycle action
* Addresses review comments
Still need to make a change in core for the FilterAllocationDecider to make the execute logic simpler
* Addresses more review comments
* Adds randomMap method to AllocateActionTests
* Addresses further review comments
* Improves handling of exceptions in Index Lifecycle
This change improves a few different aspects:
* If an exception occurs executing the lifecycle of one index it is caught, logged and other indexes are still processed
* If the lifecycle policy specified in the settings does not exist an error is logged
* Fixes the exception when the delete action is run which occurs because Phase attempts to update the phase and action settings for the deleted index. A `LifecycleAction.indexSurvives()` method is introduced which defaults to `true` but can be overridden to indicate whether the index survives following completion of the action.
* Adds test
* Fix InternalIndexLifecycleContext to update state in memory
The internal and the mock index-lifecycle-context implementations differed
in that the InternalIndexLifecycleContext assumed no one would be using it after
it mutated state. This is not the case. We assume that the current context is updated after
a `#setAction` is called so that the listener can then appropriately use the newly modified
cluster state. since idxMeta was not being updated, any call to `context.getAction` was stale and
either returning null or the previous action, not the next action that was updated by `#setAction`.
Same goes for `setPhase`.
This PR should fix this so that the Mock and Internal implementations are more in line.
`IndexLifecycleInitialisationIT.testMasterFailover()` intermittently failed because the timeout of 10 seconds to check if the index had been deleted was not long enough sometimes with the poll interval set to 3 seconds. This change sets the poll interval to 1 seconds for the test so that the lifecycle is more responsive. This also means the default value for the poll interval can be safely changed without affecting the test.
This change moves the Action classes and referenced data model classes to the new :x-pack-elasticsearch:plugin:core project in preparation for splitting the x-pack features into their own gradle modules.
Note that the TransportAction classes had to be promoted to their own class file (rather than being inner classes to their Action) so they can remain in the plugin project (and will late be move to the `index-lifecycle` project when its created.
To clean up the parsing of the LifecyclePolicy this change moves the LifecycleType to its own class so it can be created in the normal parsing of LifecyclePolicy rather than having to parse to an intermediary object first. The LifecycleType is an interface which can be implemented for different lifecycle types. These types shiould be singletons and are register with the NamedXContentRegistry and NamedWriteableRegistry only so they are available when reading from a stream or parsing.
Removes the poll-interval from the IndexLifecycleMetadata and introduces it in
the form of a cluster setting that is configurable. Changes to this poll interval setting
will reflect in the Lifecycle Scheduler.
This includes changing NOCOMMIT comments to be NORELEASE comments so the build passes with them. We have tasks inGH for all these NORELEASE comments so they should be caught before merging to master
This action will rollover an index when executed if the provided conditions are met.
Users may specify the maximum age, maximum index size in bytes or maximum index size in number of documents as conditions for rollover.
When the action executes it firsts checks the local cluster state to find out if the alias exists on the index. If the alias does not exist then the index was either rolled over by a previous run or something else has rolled over the index so the action can be marked as completed. If the index still has the alias set the action will make a rollover index request using the Client. When that request returns and the listener is called the action will only be marked as complete if the response indicates the index was rolled over. If the index was not rolled over (because the conditions are not yet met) the action is not marked as complete and will be re-evaluated on the next call to execute.
This means that the result of the action can now be async and we can then implement moving immediately to the next action if the current one is complete
The client and index metadata have now been abstracted away from the Lifecycle classes behind IndexLifecycleContext. This allow us to test the state machine without having to worry about how the state is persisted and read. It also makes the classes much easier to read and reason about.
The lifecycles are stored as custom metadata objects in the cluster state. This change also cleans up the parsing of the lifecycle state so that it can be parsed properly
This test verifies that we have sufficient failover code so that
a newly elected master re-registers old schedules and fires them off.
All times are relative to the index creation date.
Feature consists of a shell of a persistant task which will later be used to inspect the index settings and apply curator like changes to the index (move from hot to warm, rollover, shrink etc.)