OpenSearch

Commit Graph

Author	SHA1	Message	Date
Martijn van Groningen	88f4b0a326	Do not set fatal exception when shard follow task is stopped. (#37603 ) When shard follow task is cancelled while fetching operations then the fatal exception field should not be set.	2019-01-21 07:54:51 +01:00
Tim Brooks	fe753ee1d2	Do not add index event listener if CCR disabled (#37432 ) Currently we add the CcrRestoreSourceService as a index event listener. However, if ccr is disabled, this service is null and we attempt to add a null listener throwing an exception. This commit only adds the listener if ccr is enabled.	2019-01-18 16:31:21 -07:00
Tim Brooks	cd41289396	Add local session timeouts to leader node (#37438 ) This is related to #35975. This commit adds timeout functionality to the local session on a leader node. When a session is started, a timeout is scheduled using a repeatable runnable. If the session is not accessed in between two runs the session is closed. When the sssion is closed, the repeating task is cancelled. Additionally, this commit moves session uuid generation to the leader cluster. And renames the PutCcrRestoreSessionRequest to StartCcrRestoreSessionRequest to reflect that change.	2019-01-18 14:48:20 -07:00
Martijn van Groningen	6846666b6b	Add ccr follow info api (#37408 ) * Add ccr follow info api This api returns all follower indices and per follower index the provided parameters at put follow / resume follow time and whether index following is paused or active. Closes #37127 * iter * [DOCS] Edits the get follower info API * [DOCS] Fixes link to remote cluster * [DOCS] Clarifies descriptions for configured parameters	2019-01-18 16:37:21 +01:00
Tim Brooks	978c818d0f	Use RestoreSnapshotRequest in CcrRepositoryIT Commit #37535 removed an internal restore request in favor of the RestoreSnapshotRequest. Commit #37449 added a new test that used the internal restore request. This commit modifies the new test to use the RestoreSnapshotRequest.	2019-01-17 15:31:27 -07:00
Tim Brooks	b6f06a48c0	Implement follower rate limiting for file restore (#37449 ) This is related to #35975. This commit implements rate limiting on the follower side using a new class `CombinedRateLimiter`.	2019-01-17 14:58:46 -07:00
Armin Braun	381d035cd6	Remove Redundant RestoreRequest Class (#37535 ) * Same as #37464 but for the restore side	2019-01-17 22:23:23 +01:00
Martijn van Groningen	b85bfd3e17	Added fatal_exception field for ccr stats in monitoring mapping. (#37563 )	2019-01-17 14:04:41 +01:00
Martijn van Groningen	99b09845da	Moved ccr integration to the package with other ccr integration tests.	2019-01-17 13:57:56 +01:00
Przemyslaw Gomulka	5e94f384c4	Remove the use of AbstracLifecycleComponent constructor #37488 (#37488 ) The AbstracLifecycleComponent used to extend AbstractComponent, so it had to pass settings to the constractor of its supper class. It no longer extends the AbstractComponent so there is no need for this constructor There is also no need for AbstracLifecycleComponent subclasses to have Settings in their constructors if they were only passing it over to super constructor. This is part 1. which will be backported to 6.x with a migration guide/deprecation log. part 2 will have this constructor removed in 7 relates #35560 relates #34488	2019-01-16 09:05:30 +01:00
Martijn van Groningen	9554b2fecb	When removing an AutoFollower also mark it as removed. (#37402 ) Currently when there are no more auto follow patterns for a remote cluster then the AutoFollower instance for this remote cluster will be removed. If a new auto follow pattern for this remote cluster gets added quickly enough after the last delete then there may be two AutoFollower instance running for this remote cluster instead of one. Each AutoFollower instance stops automatically after it sees in the start() method that there are no more auto follow patterns for the remote cluster it is tracking. However when an auto follow pattern gets removed and then added back quickly enough then old AutoFollower may never detect that at some point there were no auto follow patterns for the remote cluster it is monitoring. The creation and removal of an AutoFollower instance happens independently in the `updateAutoFollowers()` as part of a cluster state update. By adding the `removed` field, an AutoFollower instance will not miss the fact there were no auto follow patterns at some point in time. The `updateAutoFollowers()` method now marks an AutoFollower instance as removed when it sees that there are no more patterns for a remote cluster. The updateAutoFollowers() method can then safely start a new AutoFollower instance. Relates to #36761	2019-01-15 16:24:19 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same appraoch as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Tim Brooks	5c68338a1c	Implement ccr file restore (#37130 ) This is related to #35975. It implements a file based restore in the CcrRepository. The restore transfers files from the leader cluster to the follower cluster. It does not implement any advanced resiliency features at the moment. Any request failure will end the restore.	2019-01-14 13:07:55 -07:00
Martijn van Groningen	de852765d6	unmuted test Relates to #37014	2019-01-14 14:27:42 +01:00
Martijn van Groningen	e4391afd98	Test fix, wait for auto follower to have stopped in the background Relates to #36761	2019-01-11 17:26:17 +01:00
Martijn van Groningen	6d81e7c3e7	[CCR] FollowingEngine should fail with 403 if operation has no seqno assigned (#37213 ) Fail with a 403 when indexing a document directly into a follower index. In order to test this change, I had to move specific assertions into a dedicated class and disable assertions for that class in the rest qa module. I think that is the right trade off.	2019-01-10 15:54:34 +01:00
Martijn van Groningen	df488720e0	[CCR] Make shard follow tasks more resilient for restarts (#37239 ) If a running shard follow task needs to be restarted and the remote connection seeds have changed then a shard follow task currently fails with a fatal error. The change creates the remote client lazily and adjusts the errors a shard follow task should retry. This issue was found in test failures in the recently added ccr rolling upgrade tests. The reason why this issue occurs more frequently in the rolling upgrade test is because ccr is setup in local mode (so remote connection seed will become stale) and all nodes are restarted, which forces the shard follow tasks to get restarted at some point during the test. Note that these tests cannot be enabled yet, because this change will need to be backported to 6.x first. (otherwise the issue still occurs on non upgraded nodes) I also changed the RestartIndexFollowingIT to setup remote cluster via persistent settings and to also restart the leader cluster. This way what happens during the ccr rolling upgrade qa tests, also happens in this test. Relates to #37231	2019-01-10 15:02:30 +01:00
Martijn van Groningen	1a41d84536	[CCR] Resume follow Api should not require a request body (#37217 ) Closes #37022	2019-01-10 09:48:26 +01:00
Martijn van Groningen	9122585359	[CCR] Added more logging.	2019-01-09 12:17:47 +01:00
Alpar Torok	6344e9a3ce	Testing conventions: add support for checking base classes (#36650 )	2019-01-08 13:39:03 +02:00
Jason Tedor	c8c596cead	Introduce retention lease expiration (#37195 ) This commit implements a straightforward approach to retention lease expiration. Namely, we inspect which leases are expired when obtaining the current leases through the replication tracker. At that moment, we clean the map that persists the retention leases in memory.	2019-01-07 22:03:52 -08:00
Jason Tedor	c0f8c89172	Introduce shard history retention leases (#37167 ) This commit is the first in a series which will culminate with fully-functional shard history retention leases. Shard history retention leases are aimed at preventing shard history consumers from having to fallback to expensive file copy operations if shard history is not available from a certain point. These consumers include following indices in cross-cluster replication, and local shard recoveries. A future consumer will be the changes API. Further, index lifecycle management requires coordinating with some of these consumers otherwise it could remove the source before all consumers have finished reading all operations. The notion of shard history retention leases that we are introducing here will also be used to address this problem. Shard history retention leases are a property of the replication group managed under the authority of the primary. A shard history retention lease is a combination of an identifier, a retaining sequence number, a timestamp indicating when the lease was acquired or renewed, and a string indicating the source of the lease. Being leases they have a limited lifespan that will expire if not renewed. The idea of these leases is that all operations above the minimum of all retaining sequence numbers will be retained during merges (which would otherwise clear away operations that are soft deleted). These leases will be periodically persisted to Lucene and restored during recovery, and broadcast to replicas under certain circumstances. This commit is merely putting the basics in place. This first commit only introduces the concept and integrates their use with the soft delete retention policy. We add some tests to demonstrate the basic management is correct, and that the soft delete policy is correctly influenced by the existence of any retention leases. We make no effort in this commit to implement any of the following: - timestamps - expiration - persistence to and recovery from Lucene - handoff during primary relocation - sharing retention leases with replicas - exposing leases in shard-level statistics - integration with cross-cluster replication These will occur individually in follow-up commits.	2019-01-07 07:43:57 -08:00
Jim Ferenczi	e38cf1d0dc	Add the ability to set the number of hits to track accurately (#36357 ) In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested. It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected. Relates #33028	2019-01-04 20:36:49 +01:00
Luca Cavanna	c1beb95aa1	Mute LocalIndexFollowingIT#testRemoveRemoteConnection Relates to #37014	2018-12-28 16:39:36 +01:00
Nhat Nguyen	7580d9d925	Make SourceToParse immutable (#36971 ) Today the routing of a SourceToParse is assigned in a separate step after the object is created. We can easily forget to set the routing. With this commit, the routing must be provided in the constructor of SourceToParse. Relates #36921	2018-12-24 14:06:50 -05:00
Martijn van Groningen	561b704129	[CCR] AutoFollowCoordinator and follower index already created (#36540 ) The AutoFollowCoordinator should be resilient to the fact that the follower index has already been created and in that case it should only update the auto follow metadata with the fact that the follower index was created. Relates to #33007	2018-12-24 10:16:38 +01:00
Martijn van Groningen	44fe265d82	[CCR] Added auto_follow_exception.timestamp field to auto follow stats (#36947 ) Currently auto follow stats users are unable to see whether an auto follow error was recent or old. The new timestamp field will help user distinguish between old and new errors.	2018-12-24 07:53:51 +01:00
Martijn van Groningen	4fb62fcba6	Make CCR resilient against missing remote cluster connections (#36682 ) Both index following and auto following should be resilient against missing remote connections. This happens in the case that they get accidentally removed by a user. When this happens auto following and index following will retry to continue instead of failing with unrecoverable exceptions. Both the put follow and put auto follow APIs validate whether the remote cluster connection. The logic added in this change only exists in case during the lifetime of a follower index or auto follow pattern the remote connection gets removed. This retry behavior similar how CCR deals with authorization errors. Closes #36667 Closes #36255	2018-12-24 07:28:34 +01:00
Martijn van Groningen	4ded4717fe	[CCR] Add `ccr.auto_follow_coordinator.wait_for_timeout` setting (#36714 ) This setting controls the wait for timeout the autofollow coordinator should use when setting cluster state requests to a remote cluster.	2018-12-21 09:36:40 +01:00
Tim Brooks	d9b2ed6135	Send clear session as routable remote request (#36805 ) This commit adds a RemoteClusterAwareRequest interface that allows a request to specify which remote node it should be routed to. The remote cluster aware client will attempt to route the request directly to this node. Otherwise it will send it as a proxy action to eventually end up on the requested node. It implements the ccr clean_session action with this client.	2018-12-20 17:43:12 -07:00
Tim Brooks	4cd570593d	Update index mappings when ccr restore complete (#36879 ) This is related to #35975. When the shard restore process is complete, the index mappings need to be updated to ensure that the data in the files restores is compatible with the follower mappings. This commit implements a mapping update as the final step in a shard restore.	2018-12-20 13:53:04 -07:00
Martijn van Groningen	b42074c1cc	[CCR] Report error if auto follower tries auto follow a leader index with soft deletes disabled (#36886 ) Currently if a leader index with soft deletes disabled is auto followed then this index is silently ignored. This commit changes this behavior to mark these indices as auto followed and report an error, which is visible in auto follow stats. Marking the index as auto follow is important, because otherwise the auto follower will continuously try to auto follow and fail. Relates to #33007	2018-12-20 15:21:52 +01:00
Martijn van Groningen	7b1dfeff2e	Renamed `WHITE_LISTED_SETTINGS` to `NON_REPLICATED_SETTINGS` because the latter better describes the purpose of this field.	2018-12-20 15:08:04 +01:00
Martijn van Groningen	18691daebe	[TEST] Renamed ccr qa module.	2018-12-19 13:57:12 +01:00
Martijn van Groningen	3cc0cf03c6	[TEST] No need to specifically check licensesMetaData on master node.	2018-12-19 13:51:24 +01:00
Martijn van Groningen	a6af33ef0b	[TEST] Wait for license metadata to be installed	2018-12-19 13:03:45 +01:00
Alpar Torok	e9ef5bdce8	Converting randomized testing to create a separate unitTest task instead of replacing the builtin test task (#36311 ) - Create a separate unitTest task instead of Gradle's built in - convert all configuration to use the new task - the built in task is now disabled	2018-12-19 08:25:20 +02:00
Tim Brooks	1fa105658e	Add CcrRestoreSourceService to track sessions (#36578 ) This commit is related to #36127. It adds a CcrRestoreSourceService to track Engine.IndexCommitRef need for in-process file restores. When a follower starts restoring a shard through the CcrRepository it opens a session with the leader through the PutCcrRestoreSessionAction. The leader responds to the request by telling the follower what files it needs to fetch for a restore. This is not yet implemented. Once, the restore is complete, the follower closes the session with the DeleteCcrRestoreSessionAction action.	2018-12-18 11:23:13 -07:00
Martijn van Groningen	1afcfc97bd	[TEST] Added more logging Relates to #36761	2018-12-18 16:01:02 +01:00
Boaz Leskes	5f76f39386	Rename seq# powered optimistic concurrency control parameters to ifSeqNo/ifPrimaryTerm (#36757 ) This PR renames the parameters previously introduce to the following: ### URL Parameters ``` PUT twitter/_doc/1?if_seq_no=501&if_primary_term=1 { "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elasticsearch" } DELETE twitter/_doc/1?if_seq_no=501&if_primary_term=1 ``` ### Bulk API ``` POST _bulk { "index" : { "_index" : "test", "_type" : "_doc", "_id" : "1", "if_seq_no": 501, "if_primary_term": 1 } } { "field1" : "value1" } { "delete" : { "_index" : "test", "_type" : "_doc", "_id" : "2", "if_seq_no": 501, "if_primary_term": 1 } } ``` ### Java API ``` IndexRequest.ifSeqNo(long seqNo) IndexRequest.ifPrimaryTerm(long primaryTerm) DeleteRequest.ifSeqNo(long seqNo) DeleteRequest.ifPrimaryTerm(long primaryTerm) ``` Relates #36148 Relates #10708	2018-12-18 14:35:18 +01:00
Martijn van Groningen	0ff1f1fa18	Muted tests. Relates to #36764	2018-12-18 13:39:01 +01:00
Martijn van Groningen	57e1a4bc9f	[TEST] Ensure shard follow tasks have really stopped. Relates to #36696	2018-12-18 10:43:27 +01:00
Tim Brooks	3dd5a5a3c5	Initialize startup `CcrRepositories` (#36730 ) Currently, the CcrRepositoryManger only listens for settings updates and installs new repositories. It does not install the repositories that are in the initial settings. This commit, modifies the manager to install the initial repositories. Additionally, it modifies the ccr integration test to configure the remote leader node at startup, instead of using a settings update.	2018-12-17 13:19:32 -07:00
Martijn van Groningen	a181a25226	[CCR] Add time since last auto follow fetch to auto follow stats (#36542 ) For each remote cluster the auto follow coordinator, starts an auto follower that checks the remote cluster state and determines whether an index needs to be auto followed. The time since last auto follow is reported per remote cluster and gives insight whether the auto follow process is alive. Relates to #33007 Originates from #35895	2018-12-17 14:14:56 +01:00
Martijn van Groningen	f27d2c2927	[TEST] Pause index following at end of test, so that no unexpected failures happen at test teardown.	2018-12-17 07:55:27 +01:00
Nhat Nguyen	2028c2af14	TEST: Do not assert max_seq_of_updates if promotion If a primary promotion happens in the test testAddRemoveShardOnLeader, the max_seq_no_of_updates_or_deletes on a new primary might be higher than the max_seq_no_of_updates_or_deletes on the replicas or copies of the follower. Relates #36607	2018-12-16 16:48:04 -05:00
Martijn van Groningen	97107e99e8	Moved test to its rightful place.	2018-12-16 13:57:51 +01:00
Boaz Leskes	733a6d34c1	Add seq no powered optimistic locking support to the index and delete transport actions (#36619 ) This commit add support for using sequence numbers to power [optimistic concurrency control](http://en.wikipedia.org/wiki/Optimistic_concurrency_control) in the delete and index transport actions and requests. A follow up will come with adding sequence numbers to the update and get results. Relates #36148 Relates #10708	2018-12-15 17:59:57 +01:00
Albert Zaharovits	a30e8c2fa3	HasPrivilegesResponse use TreeSet for fields (#36329 ) For class fields of type collection whose order is not important and for which duplicates are not permitted we declare them as `Set`s. Usually the definition is a `HashSet` but in this case `TreeSet` is used instead to aid testing.	2018-12-15 08:34:54 +02:00
Martijn van Groningen	68a674ef1f	[CCR] Fix follow stats API's follower index filtering feature (#36647 ) Currently always all follow stats for all follower indices are being returned even if follow stats for only specific indices are requested.	2018-12-14 19:39:30 +01:00

1 2 3 4 5 ...

335 Commits