OpenSearch

Commit Graph

Author	SHA1	Message	Date
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Andrei Dan	d918ef0da9	[Tests] Enable searchable_snapshots for non-snapshot builds (#55151 ) (#55157 ) Fixes https://github.com/elastic/elasticsearch/issues/55050 (cherry picked from commit 13391ceff1cbf6db69706c5f46127b6ff8850a1f) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-14 16:13:39 +01:00
Dimitrios Liappis	b062535e27	Mute testSearchableSnapshotAction in TimeSeriesLifecycleActions tests (#55055 ) Backport of #55052 Details in #55050	2020-04-10 16:03:09 +03:00
Andrei Dan	bbc57828c4	ILM fix retry delete action test (#54809 ) (#54895 ) Asserting on the failed_step field from the explainAPI can produce flakiness because the ILM state is moved back and forth between the (failing) step and the ERROR step (as the workflow is retry, fail then move to ERROR step, move back to the (failing) step, retry, fail, etc) and the failed_step information is only available whilst in the ERROR state. Unmute other tests as they were collateral failures A read-only index could not be deleted in the wipeCluster phase and caused these failures (cherry picked from commit 99a6d57aeb3cf11abc38b514f38a96bb1612e357) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 15:55:56 +01:00
Tanguy Leroux	4d36917e52	Merge feature/searchable-snapshots branch into 7.x (#54803 ) (#54825 ) This is a backport of #54803 for 7.x. This pull request cherry picks the squashed commit from #54803 with the additional commits: 6f50c92 which adjusts master code to 7.x a114549 to mute a failing ILM test (#54818) 48cbca1 and 50186b2 that cleans up and fixes the previous test aae12bb that adds a missing feature flag (#54861) 6f330e3 that adds missing serialization bits (#54864) bf72c02 that adjust the version in YAML tests a51955f that adds some plumbing for the transport client used in integration tests Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 13:28:53 +02:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
David Turner	ad3c96e250	AwaitsFix for #54093	2020-03-26 13:24:33 +00:00
David Turner	53e2fec93d	AwaitsFix for #53612	2020-03-26 10:41:37 +00:00
Mark Vieira	7728ccd920	Enforce consistent compile options across all projects (#54120 ) (cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)	2020-03-25 08:24:21 -07:00
Ioannis Kakavas	4a36894a48	Mute failing tests (#53781 ) See #53738	2020-03-19 08:16:23 +02:00
Ioannis Kakavas	af519cccff	Revert "Mute TimeSeriesLifecycleActionsIT (#53741 )" This reverts commit `df0ad7569b`.	2020-03-18 18:51:06 +02:00
Ioannis Kakavas	df0ad7569b	Mute TimeSeriesLifecycleActionsIT (#53741 ) see #53738	2020-03-18 17:38:24 +02:00
Ioannis Kakavas	e5aa0906f7	Mute testHistoryIsWrittenWithDeletion (#53721 ) see #53718	2020-03-18 14:49:57 +02:00
David Kyle	a38e5ca8e7	Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure (#53595 ) Failure tracked in #50353	2020-03-16 12:30:56 +00:00
Przemko Robakowski	f075d70cf8	[7.x] Avoid race condition in ILMHistoryStore (#53039 ) (#53094 ) * Avoid race condition in ILMHistoryStore (#53039) * Avoid race condition in ILMHistoryStore This change modifies ILMHistoryStore to always apply correct settings and mappings, even if the template is deleted and not yet recreated. This ensures that the ILM history index is correctly managed by ILM and also fixes flaky history tests that were prone to triggering this race. This commit also refactors and simplifies ILM history tests. Closes #50353 and #52853 * Review comment Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * fixed tests * backport #53306 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-09 22:24:15 +01:00
Przemko Robakowski	aff693bc9f	Make FreezeStep retryable (#52540 ) (#52559 ) * Make FreezeStep retryable This change marks `FreezeStep` as retryable and adds test to make sure we can really run it again. * refactor tests Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-21 10:11:35 +01:00
Przemko Robakowski	88bb06f055	Make DeleteStep retryable (#52494 ) (#52532 ) * Make DeleteStep retryable This change marks `DeleteStep` as retryable and adds test to make sure we really can invoke it again. * Fix unused import * revert unneeded changes * test reworked	2020-02-19 21:16:59 +01:00
Przemko Robakowski	d467c50e90	Make TimeSeriesLifecycleActionsIT.testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore wait for snapshot (#51892 ) (#52419 ) * waitForSnapshot tests rework * Refactor assertBusy Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-18 11:01:42 +01:00
Andrei Dan	bd3a70db4e	ILM fix the init step to actually be retryable (#52076 ) (#52375 ) We marked the `init` ILM step as retryable but our test used `waitUntil` without an assert so we didn’t catch the fact that we were not actually able to retry this step as our ILM state didn’t contain any information about the policy execution (as we were in the process of initialising it). This commit manually sets the current step to `init` when we’re moving the ilm policy into the ERROR step (this enables us to successfully move to the error step and later retry the step) * ShrunkenIndexCheckStep: Use correct logger (cherry picked from commit f78d4b3d91345a2a8fc0f48b90dd66c9959bd7ff) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-02-15 18:42:05 +00:00
Andrei Dan	da2d441d50	ILM make the set-single-node-allocation retryable (#52077 ) (#52138 ) (cherry picked from commit 0e473115958f691fc8dc87293642aea6a07fe3da) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-02-14 17:31:24 +00:00
Lee Hinman	0be61a3662	[7.x] Adding best_compression (#49974 ) (763480ee) (#51819 ) * Adding best_compression (#49974) This commit adds a `codec` parameter to the ILM `forcemerge` action. When setting the codec to `best_compression` ILM will close the index, then update the codec setting, re-open the index, and finally perform a force merge. * Fix ForceMergeAction toSteps construction (#51825) There was a duplicate force merge step and the test continued to fail. This commit clarifies the `toStep` method and changes the `assertBestCompression` method for better readability. Resolves #51822 * Update version constants Co-authored-by: Sivagurunathan Velayutham <sivadeva.93@gmail.com>	2020-02-04 14:15:43 -07:00
Lee Hinman	4594a210bf	[7.x] Fix SnapshotLifecycleRestIT.testFullPolicySnapshot (#517… (#51778 ) * Fix SnapshotLifecycleRestIT.testFullPolicySnapshot This previously was missing some key information in the output of the failure. This captures that information and adds logging at each step so we can determine the cause if it fails again. Resolves #50358	2020-01-31 15:38:28 -07:00
Lee Hinman	deefc85d60	[7.x] Stop policy on last PhaseCompleteStep instead of Termina… (#51758 ) Currently when an ILM policy finishes its execution, the index moves into the `TerminalPolicyStep`, denoted by a completed/completed/completed phase/action/step lifecycle execution state. This commit changes the behavior so that the index lifecycle execution state halts at the last configured phase's `PhaseCompleteStep`, so for instance, if an index were configured with a policy containing a `hot` and `cold` phase, the index would stop at the `cold/complete/complete` `PhaseCompleteStep`. This allows an ILM user to update the policy to add any later phases and have indices configured to use that policy pick up execution at the newly added "later" phase. For example, if a `delete` phase were added to the policy specified above, the index would then move from `cold/complete/complete` into the `delete` phase. Relates to #48431	2020-01-31 10:36:41 -07:00
David Roberts	e0e35b7feb	[TEST] Mute TimeSeriesLifecycleActionsIT.testWaitForSnapshotSlmExecutedBefore Due to https://github.com/elastic/elasticsearch/issues/50781	2020-01-29 13:08:55 +01:00
Andrei Dan	977cce002e	Preserve slm-history-ilm-policy between test runs (#51442 ) (#51468 ) (cherry picked from commit 4e95c8a94fa700d44ac31ef17547512748ab1885) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-27 10:40:40 +00:00
Andrei Dan	d872db278a	Fix TimeSeriesLifecycleActionsIT.testShrinkAction (#51431 ) (#51467 ) * Fix TimeSeriesLifecycleActionsIT.testShrinkAction Shrinking a 6 shard index to 3 shards can be quite time consuming and assertBusy probes the conditions at exponentially growing intervals. This separates the one assertion that was used for all the conditions into multiple assertBusy statements and increases the timeout for waiting for the shrink to complete. * Allow more time for shrink to complete This commit allows more time for the shrink operation to complete in testRetryFailedShrinkAction (separating the assertBusy calls too) and testMoveToRolloverStep. * Shrink to no more than 2 shards in tests (cherry picked from commit 5fe780148fa3536915d61475b087896a5b9ace82) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-27 10:40:29 +00:00
Lee Hinman	8560847dd9	[7.x] Check all snapshots in SnapshotLifecycleRestIT.testFullP… (#51448 ) * Check all snapshots in SnapshotLifecycleRestIT.testFullPolicy Rather than check the first returned snapshot for a snapshot starting with `snap-` in SnapshotLifecycleRestIT.testFullPolicy, this commit changes the test to find any snapshots starting with `snap-`. In the event that there are no snapshots (the failure case), this also exposes the full results map so we can diagnose why a failure occurred. Relates to #50358 * Use a more imperative style for checking	2020-01-24 14:30:42 -07:00
Lee Hinman	bdb8b6aa0d	[7.x] Separate aliases used for tests in TimeSeriesLifecycleAc… (#51432 ) * Separate aliases used for tests in TimeSeriesLifecycleActionsIT This is related to #51375 and hopes to help illuminate why some of those tests are failing. This commit switches the aliases used in the test to use a random alias name every time (since there were some complaints in the tests about aliases having more than one write index). With this we hope to determine the actual cause of the failure in the test. This also adds additional information to the exception returned when calling move-to-step with the incorrect current step. * Fix rest test	2020-01-24 11:05:19 -07:00
Andrei Dan	123266714b	ILM wait for active shards on rolled index in a separate step (#50718 ) (#51296 ) After we rollover the index we wait for the configured number of shards for the rolled index to become active (based on the index.write.wait_for_active_shards setting which might be present in a template, or otherwise in the default case, for the primaries to become active). This wait might be long due to disk watermarks being tripped, replicas not being able to spring to life due to cluster nodes reconfiguration and others and, the RolloverStep might not complete successfully due to this inherent transient situation, albeit the rolled index having been created. (cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-22 11:01:52 +00:00
Tim Vernum	a0ca82422c	Mute TimeSeriesLifecycleActionsIT.waitForSnapshot (#51208 ) This test was recently un-muted, but is still failing Relates: #50781 Backport of: #51203	2020-01-20 20:19:29 +11:00
Lee Hinman	731c96b507	[7.x] Use separate policies for tests in SnapshotLifecycleRest… (#51181 ) These policies store statistics, but since stats updating is asynchronous, it's possible for the update from one test to bleed into a separate one. This change switches the tests to use separate policy ids so that their stats are tracked independently. It also relaxes the checking constraint in one of the tests. Hopefully this: Resolves #48531 Resolves #48017	2020-01-17 13:26:40 -07:00
Lee Hinman	ad60f0015e	Address failures in SnapshotLifecycleRestIT.testFullPolicySnapshot (#51013 ) This test failed a couple of different ways, related to timing, as well as concurrent snapshots, and also naming. This commit splits the giant `assertBusy` into separate parts so that we don't perform ~5 different requests and tests in the same loop. It also gives each test a unique repository so that no other test can accidentally re-use snapshots. Resolves #50358 (hopefully!)	2020-01-15 09:47:41 -07:00
David Kyle	69a3626ee1	Mute SnapshotLifecycleRestIT testFullPolicySnapshot Relates to #50358	2020-01-14 13:46:37 +01:00
Przemko Robakowski	a18736b46d	[7.x] ILM action to wait for SLM policy execution (#50454 ) (#50943 ) * ILM action to wait for SLM policy execution (#50454) This change adds a new ILM action to wait for SLM policy execution to ensure that an index has a snapshot before deletion. Closes #45067 * Fix flaky TimeSeriesLifecycleActionsIT#testWaitForSnapshot test This change adds some randomness and a cleanup step to the TimeSeriesLifecycleActionsIT#testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore tests in an attempt to make them stable. Relates to #50781 * Formatting changes * Longer timeout * Fix Map.of in Java8 * Unused import removed	2020-01-14 01:34:33 +01:00
Lee Hinman	91689e793d	[7.x] Refresh cached phase policy definition if possible on ne… (#50941 ) * Refresh cached phase policy definition if possible on new policy There are some cases when updating a policy does not change the structure in a significant way. In these cases, we can reread the policy definition for any indices using the updated policy. This commit adds this refreshing to the `TransportPutLifecycleAction` to allow this. It allows us to do things like change the configuration values for a particular step, even when on that step (for example, changing the rollover criteria while on the `check-rollover-ready` step). There are more cases where the phase definition can be reread than just the ones checked here (for example, removing an action that has already been passed), and those will be added in subsequent work. Relates to #48431	2020-01-13 14:31:41 -07:00
Lee Hinman	8dc6e98819	[7.x] Make InitializePolicyContextStep retryable (#50685 ) (#50760 ) This commit makes the "init" ILM step retryable. It also adds a test where an index is created with a non-parsable index name and then fails. Related to #48183	2020-01-08 13:13:57 -07:00
Lee Hinman	615532b4f8	Mute TimeSeriesLifecycleActionsIT.testHistoryIsWritten* (#50755 ) Related to #50353	2020-01-08 10:35:44 -07:00
Andrei Dan	3915d4c055	Make the UpdateRolloverLifecycleDateStep retryable (#50702 ) (#50730 ) This makes the "update-rollover-lifecycle-date" step, which is part of the rollover action, retryable. It also adds an integration test to check the step is retried and it eventually succeeds. (cherry picked from commit 5bf068522deb2b6cd2563bcf80f34fdbf459c9f2) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-08 11:45:26 +01:00
Lee Hinman	552edd862e	[7.x] Add additional logging for ILM history store tests (#5062… (#50678 ) * Add additional logging for ILM history store tests (#50624) These tests use the same index name, making it hard to read logs when diagnosing the failures. Additionally more information about the current state of the index could be retrieved when failing. This changes these two things in the hope of capturing more data about why this fails on some CI nodes but not others. Relates to #50353	2020-01-06 15:24:24 -07:00
Christoph Büscher	6c8868e955	Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithSuccess Also muting TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure. Tracked in #50353	2020-01-03 18:32:03 +01:00
Andrei Dan	3c971f2911	ILM retryable async action steps (#50522 ) (#50591 ) This adds support for retrying AsyncActionSteps by triggering the async step after ILM was moved back on the failed step (the async step we'll be attempting to run after the cluster state reflects ILM being moved back on the failed step). This also marks the RolloverStep as retryable and adds an integration test where the RolloverStep is failing to execute as the rolled over index already exists to test that the async action RolloverStep is retried until the rolled over index is deleted. (cherry picked from commit 8bee5f4cb58a1242cc2ef4bc0317dae6c8be49d3) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-03 16:19:58 +02:00
Lee Hinman	c3c9ccf61f	[7.x] Add ILM history store index (#50287 ) (#50345 ) * Add ILM history store index (#50287) * Add ILM history store index This commit adds an ILM history store that tracks the lifecycle execution state as an index progresses through its ILM policy. ILM history documents store output similar to what the ILM explain API returns. An example document with ALL fields (not all documents will have all fields) would look like: ```json { "@timestamp": 1203012389, "policy": "my-ilm-policy", "index": "index-2019.1.1-000023", "index_age":123120, "success": true, "state": { "phase": "warm", "action": "allocate", "step": "ERROR", "failed_step": "update-settings", "is_auto-retryable_error": true, "creation_date": 12389012039, "phase_time": 12908389120, "action_time": 1283901209, "step_time": 123904107140, "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}", "step_info": "{... etc step info here as json ...}" }, "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace" } ``` These documents go into the `ilm-history-1-00000N` index to provide an audit trail of the operations ILM has performed. This history storage is enabled by default but can be disabled by setting `index.lifecycle.history_index_enabled` to `false`. Resolves #49180 * Make ILMHistoryStore.putAsync truly async (#50403) This changes the `putAsync` method in `ILMHistoryStore` so that it never blocks. Previously, due to the way that the `BulkProcessor` works, it was possible for `BulkProcessor#add` to block executing a bulk request. This was bad as we may be adding things to the history store in cluster state update threads. This also moves the index creation to be done prior to the bulk request execution, rather than being checked every time an operation was added to the queue. This lessens the chance of the index being created, then deleted (by some external force), and then recreated via a bulk indexing request. Resolves #50353	2019-12-20 12:33:36 -07:00
Andrei Dan	085d08cfd1	ILM Remove obsolete testRolloverAlreadyExists (#49104 ) (#49144 ) The rollover action is now a retryable step (see #48256) so ILM will keep retrying until it succeeds as opposed to stopping and moving the execution in the ERROR step. Fixes #49073 (cherry picked from commit 3ae90898121b43032ec8f3b50514d93a86e14d0f) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> # Conflicts: # x-pack/plugin/ilm/qa/multi-node/src/test/java/org/elasticsearch/xpack/ilm/TimeSeriesLifecycleActionsIT.java	2019-11-15 12:06:22 +00:00
Rory Hunter	c46a0e8708	Apply 2-space indent to all gradle scripts (#49071 ) Backport of #48849. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-14 11:01:23 +00:00
Lee Hinman	5eb37c29fe	[7.x] Re-read policy phase JSON when using ILM's move-to-step… (#49011 ) When using the move-to-step API, we should reread the phase JSON from the latest version of the ILM policy. This allows a user to move to the same step while re-reading the policy's latest version. For example, when changing rollover criteria. While manually messing around with some other things I discovered that we only reread the policy when using the retry API, not the move-to-step API. This commit changes the move-to-step API to always read the latest version of the policy.	2019-11-12 19:41:06 -07:00
Andrei Dan	98a9227588	Fix TimeSeriesLifecycleActionsIT.testRolloverAlreadyExists (#48747 ) (#48795 ) * ILM Test asserts on the same ilm/_explain output With the introduction of retryable steps subsequent ilm/_explain calls can see the state of an ilm cycle move out of the error step. This test made several assertions assuming that the cycle remains in the error step so this commit changes the test to make one _explain call and have all the asserts work on the same ilm state (so subsequent assumptions to the cycle being in the error step are valid). * Drop unused field in test. (cherry picked from commit 44c74bb487151c886a08b27f32b13f7a72056997) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-01 12:34:33 +00:00
Lee Hinman	d0ead688c3	[7.x] Fix TimeSeriesLifecycleActionsIT.testExplainFilters (#48… (#48776 ) This test used an index without an alias to simulate a failure in the `check-rollover-ready` step. However, with #48256 that step automatically retries, meaning that the index may not always be in the ERROR step. This commit changes the test to use a shrink action with an invalid number of shards so that it stays in the ERROR step. Resolves #48767	2019-10-31 15:25:12 -06:00
Andrei Dan	ffe5d5417f	ILM Make the `check-rollover-ready` step retryable (#48256 ) (#48740 ) This adds the infrastructure to be able to retry the execution of retryable steps and makes the `check-rollover-ready` retryable as an initial step to make the rollover action more resilient to transient errors. (cherry picked from commit 454020ac8acb147eae97acb4ccd6fb470d1e5f48) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-10-31 11:28:55 +00:00
Gordon Brown	50d7424e7d	Unmute and increase logging on flaky SLM tests (#48612 ) The failures in these tests have been remarkably difficult to track down, in part because they will not reproduce locally. This commit unmutes the flaky tests and increases logging, as well as introducing some additional logging, to attempt to pin down the failures.	2019-10-29 13:39:19 -07:00
Gordon Brown	cf235796c0	Use more reliable "never run" cron pattern in tests (#48608 ) The cron schedule "1 2 3 4 5 ?" will run every May 4 at 03:02:01, which may result in unnecessary test failures once a year. This commit switches out uses of that schedule in tests for one which will never execute (because it specifies a day which doesn't exist, Feb. 31). Also factors the schedule out to a constant to make the intent clearer.	2019-10-29 09:33:14 -07:00
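
The cron reasoning in commit cf235796c0 can be checked with a minimal sketch. It assumes the six-field layout implied by the commit message (second, minute, hour, day-of-month, month, day-of-week) and handles only plain integer fields. The helper names `describe_fixed_cron` and `can_ever_fire`, and the pattern `0 0 0 31 2 ?`, are hypothetical illustrations; the commit message does not give the replacement schedule itself, only that it names Feb. 31.

```python
import calendar


def describe_fixed_cron(expr):
    """Split a six-field cron string whose first five fields are plain integers.

    Assumed field order: second, minute, hour, day-of-month, month, day-of-week.
    The day-of-week field is ignored here (it is "?" in both examples).
    """
    sec, minute, hour, dom, month, _dow = expr.split()
    return {
        "second": int(sec),
        "minute": int(minute),
        "hour": int(hour),
        "day": int(dom),
        "month": int(month),
    }


def can_ever_fire(expr):
    """Return True if the day-of-month exists in the given month in some year."""
    fields = describe_fixed_cron(expr)
    if not 1 <= fields["month"] <= 12:
        return False
    # 2020 is a leap year, so February reports its maximum length (29 days).
    max_day = calendar.monthrange(2020, fields["month"])[1]
    return 1 <= fields["day"] <= max_day


# The schedule quoted in the commit: fires on May 4 at 03:02:01.
print(describe_fixed_cron("1 2 3 4 5 ?"), can_ever_fire("1 2 3 4 5 ?"))
# Hypothetical "never run" schedule naming Feb. 31, a date that does not exist.
print(can_ever_fire("0 0 0 31 2 ?"))
```

Checking against a leap year keeps the test conservative: a day is rejected only if it cannot occur in that month in any year.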

95 Commits