OpenSearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	16642f1c74	Handle RejectedExecutionException in ShardFollowTasksExecutor (#65648 ) (#65653 ) Follow-up to #65415. We can't have this exception bubble up in an exception handler any longer due to the new assertion so we must handle it here.	2020-12-01 06:51:05 +01:00
Nhat Nguyen	3989243a52	Stop renew retention leases when follow task fails (#65168 ) If a shard follow-task hits a non-retryable error and stops, then we should also stop the retention-leases renewal process associated with that follow-task.	2020-11-18 15:53:55 -05:00
Tanguy Leroux	e40d7e02ea	Makes testCcrRepositoryFetchesSnapshotShardSizeFromIndexShardStoreStats more robust (#64976 ) (#64989 ) Today this test fails because the sizes of the snapshot shards are only kept in a very short period of time in the InternalSnapshotsInfoService and are not guaranteed to exist once the shards are correctly assigned. closes #64167	2020-11-12 15:38:38 +01:00
Nhat Nguyen	aa0e3f85e6	Increase timeout in testCleanUpShardFollowTasksForDeletedIndices (#64562 ) If the deleted index has N shards, then ShardFollowTaskCleaner can send N*(N-1)/2 requests to remove N shard-follow tasks. I think that's fine as the implementation is straightforward. The test failed when the deleted index has 8 shards. This commit increases the timeout in the test. Closes #64311	2020-11-10 11:51:41 -05:00
Tanguy Leroux	2d0bddf428	Fix CcrRepositoryIT.testCcrRepositoryFetchesSnapshotShardSizeEtc (#64228 ) (#64479 ) This test failed sometimes for various reasons: an empty bulk request that can't be validated, a background force-merge that completes after the store stats were collected and finally an assertBusy() that waits 10 seconds while we usually wait 60s on the follower cluster in CCR tests. Closes #64167	2020-11-02 15:47:07 +01:00
Tanguy Leroux	57b5715bf7	Add CCR repository test for snapshot shard size (#63649 ) Following #61906 this commit adds two new integration tests that verify the sizes of snapshotted shards for CCR repositories. Backport of #63590	2020-10-14 12:51:42 +02:00
Tanguy Leroux	87076c32e2	Determine shard size before allocating shards recovering from snapshots (#61906 ) (#63337 ) Determines the shard size of shards before allocating shards that are recovering from snapshots. It ensures during shard allocation that the target node that is selected as recovery target will have enough free disk space for the recovery event. This applies to regular restores, CCR bootstrap from remote, as well as mounting searchable snapshots. The InternalSnapshotsInfoService is responsible for fetching snapshot shard sizes from repositories. It provides a getShardSize() method to other components of the system that can be used to retrieve the latest known shard size. If the latest snapshot shard size retrieval failed, getShardSize() returns ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While we'd like a better way to handle such failures, returning this value allows us to keep the existing behavior for now. Note that this PR does not address an issue (we already have today) where a replica is being allocated without knowing how much disk space is being used by the primary. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-10-06 18:37:05 +02:00
Nhat Nguyen	25fbc01459	Retry CCR shard follow task when no seed node left (#63225 ) If the connection between clusters is disconnected or the leader cluster is offline, then CCR shard-follow tasks can stop with "no seed node left". CCR should retry on this error.	2020-10-05 21:56:56 -04:00
Armin Braun	860791260d	Implement Shard Snapshot Clone Logic (#62771 ) (#63260 ) First part of the snapshot clone logic that implements the snapshot clone functionality on the repository level.	2020-10-05 22:55:52 +02:00
Howard	8c6e197f51	Remove allocation id from engine (#62680 ) We no longer need the allocation id in Engine.	2020-09-30 15:28:27 -04:00
Jay Modi	cb1dc5260f	Dedicated threadpool for system index writes (#62792 ) This commit adds a dedicated threadpool for system index write operations. The dedicated resources for system index writes serve as a means to ensure that user activity does not block important system operations from occurring such as the management of users and roles. Backport of #61655	2020-09-22 15:31:38 -06:00
Lee Hinman	4a08928c47	[7.x] Add index.routing.allocation.include._tier_preference setting (#62589 ) (#62667 ) This commit adds the `index.routing.allocation.prefer._tier` setting to the `DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a preference-based list of tiers for an index to be assigned to. For example, if the setting were set to: ``` "index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content" ``` If the cluster contains any nodes with the `data_hot` role, the decider will only allow them to be allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and `data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes. This allows us to specify an index's preference for tier(s) without causing the index to be unassigned if no nodes of a preferred tier are available. Subsequent work will change the ILM migration to make additional use of this setting. Relates to #60848	2020-09-18 15:41:36 -06:00
Martijn van Groningen	5f643433c6	Prohibit the usage of create index api in namespaces managed by data stream templates (#62574 ) Backport of #62527 to 7.x branch. This commit adds validation that prohibits the creation of regular indices in the namespace of templates with data streams enabled. It shouldn't be possible to create ordinary indices when the name of the index matches with a composable index template that enables data streams. Auto creation has logic that creates data streams instead of regular indices. However validation logic for the create index api was missing.	2020-09-17 20:10:42 +02:00
Nhat Nguyen	87c889f9c9	CCR should retry on CircuitBreakingException (#62013 ) CCR shard follow task can hit CircuitBreakingException on the leader cluster (read changes requests) or the follower cluster (bulk requests). CCR should retry on CircuitBreakingException as it's a transient error.	2020-09-10 17:23:47 -04:00
Jake Landis	d8dad9ab2c	[7.x] Remove integTest task from PluginBuildPlugin (#61879 ) (#62135 ) This commit removes `integTest` task from all es-plugins. Most relevant projects have been converted to use yamlRestTest, javaRestTest, or internalClusterTest in prior PRs. A few projects needed to be adjusted to allow complete removal of this task * x-pack/plugin - converted to use yamlRestTest and javaRestTest * plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task * qa/die-with-dignity - convert to javaRestTest * x-pack/qa/security-example-spi-extension - convert to javaRestTest * multiple projects - remove the integTest.enabled = false (yay!) related: #61802 related: #60630 related: #59444 related: #59089 related: #56841 related: #59939 related: #55896	2020-09-09 14:25:41 -05:00
Jake Landis	794aac717d	[7.x] Convert first 1/2 x-pack plugins from integTest to [yaml \| java]RestTest or internalClusterTest (#60630 ) (#61855 ) For 1/2 the plugins in x-pack, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. This includes the following projects: async-search, autoscaling, ccr, enrich, eql, frozen-indices, data-streams, graph, ilm, mapper-constant-keyword, mapper-flattened, ml A few of the more specialized qa projects within these plugins have not been changed with this PR due to additional complexity which should be addressed separately. A follow up PR will address the remaining x-pack plugins (this PR is big enough as-is). related: #61802 related: #56841 related: #59939 related: #55896	2020-09-02 11:19:24 -05:00
Nhat Nguyen	e37ce561c7	Set timeout of auto put-follow request to unbounded (#61679 ) If the master node of the follower cluster is busy, then the auto-follower will fail to initialize the following process. This also occurs when an auto-follow pattern matches multiple indices. We should set the timeout of put-follow requests issued by the auto-follower to unbounded to avoid this problem. Closes #56891	2020-08-31 09:58:19 -04:00
Lee Hinman	1bfebd54ea	[7.x] Allocate newly created indices on data_hot tier nodes (#61342 ) (#61650 ) This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by default when they are created. This does not break existing behavior, as nodes with the `data` role are considered to be part of the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`, `data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by default. This change is a little more complicated than changing the default value for `index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to have a plugin inject a setting into the builder for a newly created index. This has the benefit of allowing this setting to be visible as part of the settings when retrieving the index, for example: ``` // Create an index PUT /eggplant // Get an index GET /eggplant?flat_settings ``` Returns the default settings now of: ```json { "eggplant" : { "aliases" : { }, "mappings" : { }, "settings" : { "index.creation_date" : "1597855465598", "index.number_of_replicas" : "1", "index.number_of_shards" : "1", "index.provided_name" : "eggplant", "index.routing.allocation.include._tier" : "data_hot", "index.uuid" : "6ySG78s9RWGystRipoBFCA", "index.version.created" : "8000099" } } } ``` After the initial setting of this setting, it can be treated like any other index level setting. 
This new setting is not set on a new index if any of the following is true: - The index is created with an `index.routing.allocation.include.<anything>` setting - The index is created with an `index.routing.allocation.exclude.<anything>` setting - The index is created with an `index.routing.allocation.require.<anything>` setting - The index is created with a null `index.routing.allocation.include._tier` value - The index was created from an existing source metadata (shrink, clone, split, etc) Relates to #60848	2020-08-27 13:41:12 -06:00
Armin Braun	f22ddf822e	Some Optimizations around BytesArray (#61183 ) (#61511 ) * Faster `equals` for `BytesArray` which is nice since with this change we use it for the search cache * Lighter `StreamInput` for `BytesArray` that should save memory and some indirection relative to the one on the abstract bytes reference * Lighter `writeTo` implementation * Build a `BytesArray` instead of a PagedBytesReference whenever possible to save indirection and memory	2020-08-25 07:13:39 +02:00
Nhat Nguyen	328c86a4ec	Increase timeout in PrimaryFollowerAllocationIT A slow CI can take more than 10 seconds to relocate shards on the follower.	2020-08-13 14:41:32 -04:00
Nhat Nguyen	ceaa28e97b	Increase timeout in testFollowIndexWithConcurrentMappingChanges (#60534 ) The test failed because the leader was taking a lot of CPUs to process many mapping updates. This commit reduces the mapping updates, increases timeout, and adds more debug info. Closes #59832	2020-08-11 17:03:22 -04:00
Nhat Nguyen	bf7eecf1dc	Fix synchronization in ShardFollowNodeTask (#60490 ) The leader mapping, settings, and aliases versions in a shard follow-task are updated without proper synchronization and can go backward.	2020-08-11 14:52:52 -04:00
Rene Groeschke	bdd7347bbf	Merge test runner task into RestIntegTest (7.x backport) (#60600 ) * Merge test runner task into RestIntegTest (#60261) * Merge test runner task into RestIntegTest * Reorganizing Standalone runner and RestIntegTest task * Rework general test task configuration and extension * Fix merge issues * use former 7.x common test configuration	2020-08-04 14:46:32 +02:00
Jake Landis	bcb9d06bb6	[7.x] Cleanup xpack build.gradle (#60554 ) (#60603 ) This commit does three things: * Removes all Copyright/license headers for the build.gradle files under x-pack. (implicit Apache license) * Removes evaluationDependsOn(xpackModule('core')) from build.gradle files under x-pack * Removes a place holder test in favor of disabling the test task (in the async plugin)	2020-08-03 13:11:43 -05:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Nhat Nguyen	9d4a64e749	Allow CCR on nodes with legacy roles only (#60093 ) CCR will stop functioning if the master node is on 7.8, but data nodes are before that version because the master node considers that all data nodes do not have the remote cluster client role. This commit allows CCR to work on data nodes with legacy roles only. Relates #54146 Relates #59375	2020-07-29 10:57:31 -04:00
Nhat Nguyen	416e51980c	Relax ShardFollowTasksExecutor validation (#60054 ) If a primary shard of a follower index is being relocated, then we will fail to create a follow-task. This validation is too restrictive. We should ensure that all primaries of the follower index are active instead. Closes #59625	2020-07-28 13:46:49 -04:00
Nhat Nguyen	6ece629ec3	Set timeout of master requests on follower to unbounded (#60070 ) Today, a follow task will fail if the master node of the follower cluster is temporarily overloaded and unable to process master node requests (such as update mapping, setting, or alias) from a follow-task within the default timeout. This error is transient, and follow-tasks should not abort. We can avoid this problem by setting the timeout of master node requests on the follower cluster to unbounded. Closes #56891	2020-07-28 13:46:49 -04:00
Nhat Nguyen	bc65b3a590	Increase timeout in AutoFollowIT (#60004 ) It can take more than 10 seconds to auto-follow and create a follow-task on a slow CI. This commit increases timeout in AutoFollowIT by replacing assertBusy with assertLongBusy. Closes #59952	2020-07-23 16:36:53 -04:00
Nhat Nguyen	0fe4d5df67	Increase timeout testFollowIndexWithConcurrentMappingChanges Fixes #59273	2020-07-23 16:22:58 -04:00
Yannick Welsch	07784a0b16	CCR recoveries using wrong setting for chunk sizes (#59597 ) The default chunk size for CCR file-based recoveries was wrongly set to 40MB instead of 1MB.	2020-07-21 13:56:06 +02:00
Nhat Nguyen	b599f7a9c0	Fix estimate size of translog operations (#59206 ) Make sure that the estimateSize method includes all fields of translog operations.	2020-07-16 00:19:30 -04:00
Yannick Welsch	bc11503dc3	Wait for active license in CcrRestIT (#59543 ) Relates #53966 Closes #59486	2020-07-15 09:38:08 +02:00
Armin Braun	2dd086445c	Enable Fully Concurrent Snapshot Operations (#56911 ) (#59578 ) Enables fully concurrent snapshot operations: * Snapshot create- and delete operations can be started in any order * Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed * We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long-running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one)) * Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations See updated JavaDoc and added test cases for more details and illustration on the functionality. Some notes: The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring. The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. 
I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up. This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots. Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first	2020-07-15 03:42:31 +02:00
Armin Braun	e1014038e9	Simplify Repository.finalizeSnapshot Signature (#58834 ) (#59574 ) Many of the parameters we pass into this method were only used to build the `SnapshotInfo` instance to write. This change simplifies the signature. Also, it seems less error prone to build `SnapshotInfo` in `SnapshotsService` instead of relying on the fact that each repository implementation will build the correct `SnapshotInfo`.	2020-07-15 00:14:28 +02:00
Armin Braun	d456f7870a	Deduplicate Index Metadata in BlobStore (#50278 ) (#59514 ) This PR introduces two new fields into `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices`) so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot. This saves one write per index in the common case of unchanged metadata thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time. The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`). Relates to #45736 as it improves the efficiency of snapshotting unchanged indices Relates to #49800 as it has the potential to make loading the index metadata for multiple snapshots of the same index concurrently much more efficient, speeding up future concurrent snapshot deletes	2020-07-14 22:18:42 +02:00
Tim Brooks	408a07f96a	Separate coordinating and primary bytes in stats (#59487 ) Currently we combine coordinating and primary bytes into a single bucket for indexing pressure stats. This makes sense for rejection logic. However, for metrics it would be useful to separate them.	2020-07-14 12:37:06 -06:00
Nhat Nguyen	4d7c59bedb	Assign follower primary to nodes with remote cluster client role (#59375 ) The primary shards of follower indices during the bootstrap need to be on nodes with the remote cluster client role as those nodes reach out to the corresponding leader shards on the remote cluster to copy Lucene segment files and renew the retention leases. This commit introduces a new allocation decider that ensures bootstrapping follower primaries are allocated to nodes with the remote cluster client role. Co-authored-by: Jason Tedor <jason@tedor.me>	2020-07-14 11:23:55 -04:00
Tim Brooks	623df95a32	Adding indexing pressure stats to node stats API (#59467 ) We have recently added internal metrics to monitor the amount of indexing occurring on a node. These metrics introduce back pressure to indexing when memory utilization is too high. This commit exposes these stats through the node stats API.	2020-07-13 17:23:42 -06:00
Daniel Mitterdorfer	daa48329ec	[TEST] Mute FollowerFailOverIT.testFailOverOnFollower (#58659 ) (#59286 ) Relates #58534 Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co>	2020-07-09 12:38:36 +02:00
Nhat Nguyen	ef5c397c0f	Sending operations concurrently in peer recovery (#58018 ) Today, we send operations in phase2 of peer recoveries batch by batch sequentially. Normally that's okay as we should have a fairly small number of operations in phase 2 due to the file-based threshold. However, if phase1 takes a lot of time and we are actively indexing, then phase2 can have a lot of operations to replay. With this change, we will send multiple batches concurrently (defaults to 1) to reduce the recovery time. Backport of #58018	2020-07-07 22:03:31 -04:00
Jake Landis	604c6dd528	7.x - Create plugin for yamlTest task (#56841 ) (#59090 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 14:16:26 -05:00
Dan Hermann	c1781bc7e7	[7.x] Add include_data_streams flag for authorization (#59008 )	2020-07-03 12:58:39 -05:00
Tim Brooks	dc9e364ff2	Count coordinating and primary bytes as write bytes (#58984 ) This is a follow-up to #57573. This commit combines coordinating and primary bytes under the same "write" bucket. Double accounting is prevented by only accounting the bytes at either the reroute phase or the primary phase. TransportBulkAction calls execute directly, so the operations handler is skipped and the bytes are not double accounted.	2020-07-02 19:48:19 -06:00
Tim Brooks	1ef2cd7f1a	Add memory tracking to queued write operations (#58957 ) Currently we do not track the memory consumed by in-process write operations. This commit adds a mechanism to track write operation memory usage.	2020-07-02 14:14:57 -06:00
Nhat Nguyen	f63cbad629	Ensure CCR partial reads never overuse buffer (#58620 ) When the documents are large, a follower can receive a partial response because the requesting range of operations is capped by max_read_request_size instead of max_read_request_operation_count. In this case, the follower will continue reading the subsequent ranges without checking the remaining size of the buffer. The buffer then can use more memory than max_write_buffer_size and even causes OOM. Backport of #58620	2020-07-01 13:23:28 -04:00
Ryan Ernst	c23613e05a	Split license allowed checks into two types (#58704 ) (#58797 ) The checks on the license state have a singular method, isAllowed, that returns whether the given feature is allowed by the current license. However, there are two classes of usages, one which intends to actually use a feature, and another that intends to return in telemetry whether the feature is allowed. When feature usage tracking is added, the latter case should not count as a "usage", so this commit reworks the calls to isAllowed into 2 methods, checkFeature, which will (eventually) both check whether a feature is allowed, and keep track of the last usage time, and isAllowed, which simply determines whether the feature is allowed. Note that I considered having a boolean flag on the current method, but wanted the additional clarity that a different method name provides, versus a boolean flag which is more easily copied without realizing what the flag means since it is nameless in call sites.	2020-07-01 07:11:05 -07:00
Yannick Welsch	15c85b29fd	Account for recovery throttling when restoring snapshot (#58658 ) (#58811 ) Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account (i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to configure throttling in a single place. The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to `40mb` (which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change will be observed by clusters where the recovery and restore settings were not adapted. Relates https://github.com/elastic/elasticsearch/issues/57023 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-07-01 12:19:29 +02:00
David Turner	3a234d2669	Account for remaining recovery in disk allocator (#58800 ) Today the disk-based shard allocator accounts for incoming shards by subtracting the estimated size of the incoming shard from the free space on the node. This is an overly conservative estimate if the incoming shard has almost finished its recovery since in that case it is already consuming most of the disk space it needs. This change adds to the shard stats a measure of how much larger each store is expected to grow, computed from the ongoing recovery, and uses this to account for the disk usage of incoming shards more accurately. Backport of #58029 to 7.x * Picky picky * Missing type	2020-07-01 10:12:44 +01:00
Rene Groeschke	d952b101e6	Replace compile configuration usage with api (7.x backport) (#58721 ) * Replace compile configuration usage with api (#58451) - Use java-library instead of plugin to allow api configuration usage - Remove explicit references to runtime configurations in dependency declarations - Make test runtime classpath input for testing convention - required as java library will by default not have a built jar file - jar file is now an explicit input of the task and gradle will ensure it is properly built * Fix compile usages in 7.x branch	2020-06-30 15:57:41 +02:00

693 Commits