OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jason Tedor	73417bf09a	Move CCR REST tests to a sub-project of ccr This commit moves these REST tests (possibly temporarily) to a sub-project of ccr. We do this (again, possibly temporarily) to keep them within the ccr sub-project yet there are changes within 6.x that prevent these from being in the top-level project (the cluster formation tasks are trying to install x-pack-ccr into the integ-test-zip). Therefore, we isolate these for now until we can understand why there are differences between 6.x and master.	2018-09-15 10:18:59 -04:00
Jason Tedor	aa56892f2f	Move CCR REST tests to ccr sub-project (#33731 ) This commit moves the CCR REST tests to the ccr sub-project as another step towards running :x-pack:plugin:ccr:check giving us full coverage on CCR.	2018-09-15 09:18:15 -04:00
Jason Tedor	f037edb8e3	Move CCR monitoring tests to ccr sub-project (#33730 ) This commit moves the CCR monitoring tests from the monitoring sub-project to the ccr sub-project.	2018-09-15 09:16:33 -04:00
Martijn van Groningen	82a6ae1dae	[CCR] Move ccr tests in core module back to ccr module (#33711) When developing ccr it is not ideal if tests are spread across multiple modules. Even though the classes these tests cover are in the core module, it is easier if the tests live in the ccr module, because running the test task in the core module also runs many non-ccr tests. This way, when developing ccr, we can locally run ./gradlew x-pack:plugin:core:precommit x-pack:plugin:ccr:check before pushing to PR branches and be confident that the PR build passes, without running the x-pack:plugin:core:check task.	2018-09-14 17:18:00 +02:00
Jason Tedor	2282150f34	Expose retries for CCR fetch failures (#33694 ) This commit exposes the number of times that a fetch has been tried to the CCR stats endpoint, and to CCR monitoring.	2018-09-14 08:52:46 -04:00
Martijn van Groningen	222f42274e	[CCR] Check whether the rejected execution exception has the shutdown flag set (#33703 ) and if so debug log it and otherwise rethrow. This should fix a couple of test failures where during test teardown tests failed due to uncaught exceptions being detected.	2018-09-14 13:28:11 +02:00
Martijn van Groningen	4bcad95fe7	[TEST] wait for no initializing shards	2018-09-14 09:59:24 +02:00
Martijn van Groningen	53ba253aa4	[CCR] Add validation for max_retry_delay (#33648 )	2018-09-13 20:52:00 +02:00
Martijn van Groningen	a69ae6b89f	[CCR] Add metadata to keep track of the index uuid of the leader index in the follow index (#33367 ) The follow index api checks if the recorded uuid in the follow index matches with uuid of the leader index and fails otherwise. This validation will prevent a follow index from following an incompatible leader index. The create_and_follow api will automatically add this custom index metadata when it creates the follow index. Closes #31505	2018-09-13 11:36:52 +02:00
Jason Tedor	eb715d5290	Add follower index to CCR monitoring and status (#33645 ) This commit adds the follower index to CCR shard follow task status, and to monitoring.	2018-09-12 17:35:06 -04:00
Martijn van Groningen	b5d8495789	[CCR] Add auto follow pattern APIs to transport client. (#33629 )	2018-09-12 21:50:22 +02:00
Martijn van Groningen	5fa81310cc	[CCR] Added history uuid validation (#33546) For correctness we need to verify that the history uuid of the leader index shards never changes while that index is being followed. * The history UUIDs are recorded as custom index metadata in the follow index. * The follow api validates whether the current history UUIDs of the leader index shards are the same as the recorded history UUIDs. If not, the follow api fails. * While a follow index is following a leader index, shard follow tasks verify on each shard changes api call whether the current history uuid is the same as the recorded history uuid. Relates to #30086 Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>	2018-09-12 19:42:00 +02:00
Martijn van Groningen	901d8035d9	[CCR] Update es monitoring mapping and (#33635 ) * [CCR] Update es monitoring mapping and change qa tests to query based on leader index. Co-authored-by: Jason Tedor <jason@tedor.me>	2018-09-12 19:36:17 +02:00
Tanguy Leroux	bcac7f5e55	Fix checkstyle violation in ShardFollowNodeTask	2018-09-12 16:03:52 +02:00
Jason Tedor	23f12e42c1	Expose CCR stats to monitoring (#33617 ) This commit exposes the CCR stats endpoint to monitoring collection. Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>	2018-09-12 09:13:07 -04:00
Martijn van Groningen	96c49e5ed0	[CCR] Improve shard follow task's retryable error handling (#33371) Improve failure handling of retryable errors by retrying remote calls in an exponential-backoff-like manner. The delay between retries will not be longer than the configured max retry delay. Retryable errors are also retried indefinitely. Relates to #30086	2018-09-12 12:49:51 +02:00
Jason Tedor	20476b9e06	Disable CCR REST endpoints if CCR disabled (#33619 ) This commit avoids enabling the CCR REST endpoints if CCR is disabled.	2018-09-12 01:54:34 -04:00
Jason Tedor	eca37e6e0a	Expose CCR to the transport client (#33608 ) This commit exposes CCR to the transport client.	2018-09-11 16:37:52 -04:00
Martijn van Groningen	74d41857c6	mute test on windows Relates #33570	2018-09-10 16:49:17 +02:00
Martijn van Groningen	8eebca32d2	[CCR] Delay auto follow license check (#33557 ) * [CCR] Delay auto follow license check so that we're sure that there are auto follow patterns configured Otherwise we log a warning in case someone is running with basic or gold license and has not used the ccr feature.	2018-09-10 13:23:02 +02:00
Martijn van Groningen	c4adcee3ea	[CCR] Add create_follow_index privilege (#33559 ) This is a new index privilege that the user needs to have in the follow cluster. This privilege is required in addition to the `manage_ccr` cluster privilege in order to execute the create and follow api. Closes #33555	2018-09-10 13:08:20 +02:00
Jason Tedor	d1b99877fa	Remove underscore from auto-follow API (#33550 ) This commit removes the leading underscore from _auto_follow in the auto-follow API endpoints.	2018-09-09 14:42:49 -04:00
Nhat Nguyen	902d20cbbe	CCR: Use single global checkpoint to normalize range (#33545 ) We may use different global checkpoints to validate/normalize the range of a change request if the global checkpoint is advanced between these calls. If this is the case, then we generate an invalid request range.	2018-09-09 13:18:30 -04:00
Jason Tedor	6eca627409	Reverse logic for CCR license checks (#33549 ) This commit reverses the logic for CCR license checks in a few actions. This is done so that the successful case, which tends to be a larger block of code, does not require indentation.	2018-09-09 10:22:22 -04:00
Jason Tedor	edc492419b	Add latch countdown on failure in CCR license tests (#33548 ) We have some listeners in the CCR license tests that invoke Assert#fail if the onSuccess method for the listener is unexpectedly invoked. This can leave the main test thread hanging until the test suite times out rather than failing quickly. This commit adds some latch countdowns so that we fail quickly if these cases are hit.	2018-09-09 09:52:40 -04:00
Jason Tedor	c67b0ba33e	Create temporary directory if needed in CCR test In the multi-cluster-with-non-compliant-license tests, we try to write out a java.policy to a temporary directory. However, if this temporary directory does not already exist then writing the java.policy file will fail. This commit ensures that the temporary directory exists before we attempt to write the java.policy file.	2018-09-09 07:16:56 -04:00
Jason Tedor	5a38c930fc	Add license checks for auto-follow implementation (#33496 ) This commit adds license checks for the auto-follow implementation. We check the license on put auto-follow patterns, and then for every coordination round we check that the local and remote clusters are licensed for CCR. In the case of non-compliance, we skip coordination yet continue to schedule follow-ups.	2018-09-09 07:06:55 -04:00
Simon Willnauer	c12d232215	Pass Directory instead of DirectoryService to Store (#33466 ) Instead of passing DirectoryService which causes yet another dependency on Store we can just pass in a Directory since we will just call `DirectoryService#newDirectory()` on it anyway.	2018-09-07 14:00:24 +02:00
Nhat Nguyen	8afe09a749	Pass TranslogRecoveryRunner to engine from outside (#33449 ) This commit allows us to use different TranslogRecoveryRunner when recovering an engine from its local translog. This change is a prerequisite for the commit-based rollback PR. Relates #32867	2018-09-06 11:59:16 -04:00
Martijn van Groningen	ef207edbf0	test: do not schedule when test has stopped	2018-09-06 14:14:24 +02:00
Martijn van Groningen	cdd82bb203	test: fetch `SeqNoStats` inside try-catch block Relates to #33457	2018-09-06 11:49:08 +02:00
Martijn van Groningen	a721d09c81	[CCR] Added auto follow patterns feature (#33118) Auto follow patterns is a cross cluster replication feature that tracks whether indices are being created in the leader cluster with names that match a specific pattern and, if so, automatically lets the follower cluster follow these newly created indices. This change adds an `AutoFollowCoordinator` component that is only active on the elected master node. Periodically this component checks the cluster state of remote clusters for new leader indices that match the configured auto follow patterns defined in the `AutoFollowMetadata` custom metadata. This change also adds two new APIs to manage auto follow patterns. A put auto follow pattern api: ``` PUT /_ccr/_autofollow/{{remote_cluster}} { "leader_index_pattern": ["logs-*", ...], "follow_index_pattern": "{{leader_index}}-copy", "max_concurrent_read_batches": 2 ... // other optional parameters } ``` and a delete auto follow pattern api: ``` DELETE /_ccr/_autofollow/{{remote_cluster_alias}} ``` The auto follow patterns are directly tied to the remote cluster aliases configured in the follow cluster. Relates to #33007 Co-authored-by: Jason Tedor <jason@tedor.me>	2018-09-06 08:01:58 +02:00
Jason Tedor	d71ced1b00	Generalize search.remote settings to cluster.remote (#33413 ) With features like CCR building on the CCS infrastructure, the settings prefix search.remote makes less sense as the namespace for these remote cluster settings than does a more general namespace like cluster.remote. This commit replaces these settings with cluster.remote with a fallback to the deprecated settings search.remote.	2018-09-05 20:43:44 -04:00
Nhat Nguyen	16b53b5ab5	Mute testValidateFollowingIndexSettings Tracked at #33379	2018-09-04 09:03:26 -04:00
Alpar Torok	7f7e8fd733	Disable assemble task instead of removing it (#33348 )	2018-09-04 07:32:14 +03:00
Nhat Nguyen	3a1dad1050	Mute testFollowIndexAndCloseNode Tracked at #33337	2018-09-02 19:17:51 -04:00
Nhat Nguyen	c6b011f8ea	TEST: Increase timeout testFollowIndexAndCloseNode (#33333 ) This test fails several times due to timeout when asserting the number of docs on the following and leading indices. This change reduces the number of docs to index and increases the timeout.	2018-09-02 09:28:47 -04:00
Martijn van Groningen	66b164c2a6	[CCR] Replaced custom follow and unfollow api response classes with AcknowledgedResponse (#33260) These response classes did not add any value, so AcknowledgedResponse should be used instead. I also changed the formatting of methods to take one line per parameter in the FollowIndexAction.java and UnfollowIndexAction.java files to make reviewing future diffs easier.	2018-08-31 21:16:06 +07:00
Nhat Nguyen	d3f32273eb	Merge branch 'master' into ccr	2018-08-30 23:22:58 -04:00
Martijn van Groningen	41c7fc8d37	[CCR] Introduce leader index name & last fetch time stats to stats api response (#33155 )	2018-08-29 10:54:58 +07:00
Nhat Nguyen	e2b931e80b	Use Lucene history in primary-replica resync (#33178 ) This commit makes primary-replica resyncer use Lucene as the source of history operation instead of translog if soft-deletes is enabled. With this change, we no longer expose translog snapshot directly in IndexShard. Relates #29530	2018-08-28 10:44:15 -04:00
Jason Tedor	5954354e62	Fix ShardFollowNodeTask.Status equals and hash code (#33189 ) These were broken when fetch exceptions were introduced to the status object but equals and hash code were not updated then. This commit addresses that.	2018-08-28 08:53:45 -04:00
Jason Tedor	cd91992c89	Only fetch mapping updates when necessary (#33182 ) Today we fetch the mapping from the leader and apply it as a mapping update whenever the index metadata version on the leader changes. Yet, the index metadata can change for many reasons other than a mapping update (e.g., settings updates, adding an alias, or a replica being promoted to a primary among many other reasons). This commit builds on the addition of a mapping version to the index metadata to only fetch mapping updates when the mapping version increases. This reduces the number of these fetches and application of mappings on the follower to the bare minimum.	2018-08-28 06:06:22 -04:00
Jason Tedor	0e5d42ca38	Merge branch 'master' into ccr * master: Adjust BWC version on mapping version Token API supports the client_credentials grant (#33106) Build: forked compiler max memory matches jvmArgs (#33138) Introduce mapping version to index metadata (#33147) SQL: Enable aggregations to create a separate bucket for missing values (#32832) Fix grammar in contributing docs SECURITY: Fix Compile Error in ReservedRealmTests (#33166) APM server monitoring (#32515) Support only string `format` in date, root object & date range (#28117) [Rollup] Move toBuilders() methods out of rollup config objects (#32585) Fix forbiddenapis on java 11 (#33116) Apply publishing to genreate pom (#33094) Have circuit breaker succeed on unknown mem usage Do not lose default mapper on metadata updates (#33153) Fix a mappings update test (#33146) Reload Secure Settings REST specs & docs (#32990) Refactor CachingUsernamePassword realm (#32646)	2018-08-27 13:49:59 -04:00
Martijn van Groningen	47e9e72df2	reduce maximum number of writes to speed up test	2018-08-27 12:14:46 +07:00
Jason Tedor	ef9607ea0c	Track fetch exceptions for shard follow tasks (#33047 ) This commit adds tracking and reporting for fetch exceptions. We track fetch exceptions per fetch, keeping track of up to the maximum number of concurrent fetches. With each failing fetch, we associate the from sequence number with the exception that caused the fetch. We report these in the CCR stats endpoint, and add some testing for this tracking.	2018-08-24 14:21:23 -04:00
Jason Tedor	7fa8a728c4	Make CCR QA tests build again (#33113) Welp, I broke this. I merged a change to auto-discover the CCR QA tests by making :x-pack:plugin:ccr:check auto-discover the check tasks in the qa sub-project. Yet, the check tasks for these sub-projects did not depend on the necessary test tasks (as we were previously doing this directly from the ccr build file). This commit fixes this!	2018-08-24 09:48:54 -04:00
Martijn van Groningen	b0f22d67c4	fixed not returning response instance	2018-08-24 16:56:29 +07:00
Martijn van Groningen	575f33941c	Required changes after merging in master branch.	2018-08-24 12:51:26 +07:00
Jason Tedor	9623cf6cde	Find CCR QA sub-projects automatically (#33027 ) Today we are by-hand maintaining a list of CCR QA sub-projects that the check task depends on. This commit simplifies this by finding these sub-projects automatically and adding their check task as dependencies of the CCR check task.	2018-08-21 12:51:55 -04:00
Jason Tedor	b08d02e3b7	Implement CCR licensing (#33002 ) This commit implements licensing for CCR. CCR will require a platinum license, and administrative endpoints will be disabled when a license is non-compliant.	2018-08-20 23:33:18 -04:00
Nhat Nguyen	919888eba7	TEST: Enable debug log testValidateFollowingIndexSettings	2018-08-06 14:55:56 -04:00
Nhat Nguyen	c394eb9ae9	CCR: Expose the operation primary term Relates #32442	2018-08-06 10:55:37 -04:00
Jason Tedor	3b739b9fd5	Avoid NPE on shard changes action (#32630 ) If a leader index is deleted while there is an active follower, the follower will send shard changes requests bound for the leader index. Today this will result in a null pointer exception because there will not be an index routing table for the index. A null pointer exception looks like a bug to a user so this commit addresses this by throwing an index not found exception instead.	2018-08-06 08:01:47 -04:00
Jason Tedor	32c2759bb9	Remove extra blank line in CcrStatsAction.java This commit removes an extra blank line that was accidentally committed to CcrStatsAction.java.	2018-08-03 09:55:04 -04:00
Jason Tedor	d640c9ddf9	Introduce CCR stats endpoint (#32350 ) This commit introduces the CCR stats endpoint which provides shard-level stats on the status of CCR follower tasks.	2018-08-03 09:09:45 -04:00
Jason Tedor	2387616c80	Remove _xpack from CCR APIs (#32563 ) For a new feature like CCR we will go without this extra layer of indirection. This commit replaces all /_xpack/ccr/_(\S+) endpoints by /_ccr/$1 endpoints.	2018-08-02 20:21:43 -04:00
Nhat Nguyen	8cfbb64d6e	ShardFollowNodeTask should fetch operation once (#32455) Today ShardFollowNodeTask might fetch some operations more than once. This happens because we ask the leader for up to max_batch_count operations (instead of the left-over size) for the left-over request. The leader can then freely respond with up to max_batch_count operations, and at the same time, if one of the previous requests has completed, we might issue another read request whose range overlaps with the response of the left-over request. Closes #32453	2018-07-30 20:53:09 -04:00
Nhat Nguyen	aa3b6e098c	Reject follow request if following setting not enabled on follower (#32448 ) Today we do not check if the `following_index` setting of the follower is enabled or not when processing a follow-request. If that setting is disabled, the follower will use the default engine, not the following engine. This change checks and rejects such invalid follow requests. Relates #30086	2018-07-29 21:57:45 -04:00
Nhat Nguyen	8474f8a01c	Validate source of an index in LuceneChangesSnapshot (#32288) Today it's possible to encounter an Index operation in Lucene whose _source is disabled, and _recovery_source was pruned by the MergePolicy. If that is the case, we create a Translog#Index without source and let the caller validate it later. However, this approach is challenging for the caller. Deletes and No-Ops don't allow invoking the "source()" method, so the caller has to make sure to call "source()" only on index operations. The current implementation in CCR does not follow this and fails to replicate deletes or no-ops. Moreover, it's easier to reason about if a Translog#Index always has the source.	2018-07-27 08:16:52 -04:00
Nhat Nguyen	cd8b80da58	Use shadow plugin in ccr/qa	2018-07-25 00:16:33 -04:00
Nhat Nguyen	a5d8f0b55a	CCR: use shadow plugin Relates #32240	2018-07-24 22:48:11 -04:00
Nhat Nguyen	88190299df	CCR: Fix incorrect read request completion condition (#32266) Today we consider a read request exhausted if from_seqno is equal to or greater than max_required_seqno. However, if we stop when from_seqno equals max_required_seqno, we will miss the operation whose seqno is max_required_seqno because we have not seen that operation yet.	2018-07-22 22:14:27 -04:00
Martijn van Groningen	b6b596e471	[CCR] Add random shard follow task test (#32188 ) Added shard follow task unit tests that tests whether the shard follow task is able to process randomly generated shard changes api responses.	2018-07-21 12:38:05 +02:00
Nhat Nguyen	8e15504443	TEST: Fix range issue in ShardChangesActionTests We modified the way we calculate to_seqno in #32121 but did not adjust this test accordingly. If min_seqno equals max_seqno, the size should be one instead of zero. Relates #32121	2018-07-20 17:20:41 -04:00
Nhat Nguyen	fe574f89f8	CCR: Translog op on primary should have versionType Normally translog operations will not be replayed on the primary. Following engine is an exception where we replay translog on both primary and replica as a non-primary strategy. Even though we won't use the version_type in the following engine, we still need to pass a valid value for the primary operation in order not to trip assertions in an engine. This commit passes version_type EXTERNAL for translog operation if its origin is primary. Relates #31945	2018-07-20 08:39:38 -04:00
Martijn van Groningen	a6b7497fdc	[CCR] Add more unit tests for shard follow task (#32121 ) The added tests are based on specific scenarios as described in the test plan. Before this change the ShardFollowNodeTaskTests contained more random like tests, but these have been removed and in a followup pr better random tests will be added in a new test class as is described in the test plan.	2018-07-20 14:12:05 +02:00
Nhat Nguyen	d0f3ed5abd	Merge branch 'master' into ccr * master: Painless: Simplify Naming in Lookup Package (#32177) Handle missing values in painless (#32207) add support for write index resolution when creating/updating documents (#31520) ECS Task IAM profile credentials ignored in repository-s3 plugin (#31864) Remove indication of future multi-homing support (#32187) Rest test - allow for snapshots to take 0 milliseconds Make x-pack-core generate a pom file Rest HL client: Add put watch action (#32026) Build: Remove pom generation for plugin zip files (#32180) Fix comments causing errors with Java 11 Fix rollup on date fields that don't support epoch_millis (#31890) Detect and prevent configuration that triggers a Gradle bug (#31912) [test] port linux package packaging tests (#31943) Revert "Introduce a Hashing Processor (#31087)" (#32178) Remove empty @return from JavaDoc Adjust SSLDriver behavior for JDK11 changes (#32145) [test] use randomized runner in packaging tests (#32109) Add support for field aliases. (#32172) Painless: Fix caching bug and clean up addPainlessClass. (#32142) Call setReferences() on custom referring tokenfilters in _analyze (#32157) Fix BwC Tests looking for UUID Pre 6.4 (#32158) Improve docs for search preferences (#32159) use before instead of onOrBefore Add more contexts to painless execute api (#30511) Add EC2 credential test for repository-s3 (#31918) A replica can be promoted and started in one cluster state update (#32042) Fix Java 11 javadoc compile problem Fix CP for namingConventions when gradle home has spaces (#31914) Fix `range` queries on `_type` field for singe type indices (#31756) [DOCS] Update TLS on Docker for 6.3 (#32114) ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are Remove versionType from translog (#31945) Switch distribution to new style Requests (#30595) Build: Skip jar tests if jar disabled Painless: Add PainlessClassBuilder (#32141) Build: Make additional test deps of check (#32015) Disable C2 from using AVX-512 on JDK 10 (#32138) Build: Move shadow customizations into common code (#32014) Painless: Fix Bug with Duplicate PainlessClasses (#32110) Remove empty @param from Javadoc Re-disable packaging tests on suse boxes Docs: Fix missing example script quote (#32010) [ML] Wait for aliases in multi-node tests (#32086) [ML] Move analyzer dependencies out of categorization config (#32123) Ensure to release translog snapshot in primary-replica resync (#32045) Handle TokenizerFactory TODOs (#32063) Relax TermVectors API to work with textual fields other than TextFieldType (#31915) Updates the build to gradle 4.9 (#32087) Mute :qa:mixed-cluster indices.stats/10_index/Index - all’ Check that client methods match API defined in the REST spec (#31825) Enable testing in FIPS140 JVM (#31666) Fix put mappings java API documentation (#31955) Add exclusion option to `keep_types` token filter (#32012) [Test] Modify assert statement for ssl handshake (#32072)	2018-07-19 23:03:01 -04:00
Martijn van Groningen	d88c76e02b	[CCR] Initial replication group based tests (#32024 ) Tests shard follow task in the context of a leader and follower ReplicationGroup, in order to test how the shard follow logic reacts to certain shard related failure scenarios. More tests will need to be added, but this indicates what changes need to be made to have these tests. Relates to #30102	2018-07-17 17:39:49 +02:00
Martijn van Groningen	006c79a80d	[CCR] Improve retry mechanism when making remote calls from shard follow task (#31930 ) Closes #31816	2018-07-17 10:25:51 +02:00
Martijn van Groningen	815faf34fc	[CCR] Move api parameters from url to request body. (#31949 ) Relates to #30102	2018-07-11 10:16:43 +02:00
Martijn van Groningen	8e1ef0cff9	Rewrite shard follow node task logic (#31581) The current shard follow mechanism is complex and does not give us easy ways to have visibility into the system (e.g. why we are falling behind). The main reason why it is complex is because the current design is highly asynchronous. Also in the current model it is hard to apply backpressure other than reducing the concurrent reads from the leader shard. This PR has the following changes: * Rewrote the shard follow task to coordinate the shard follow mechanism between a leader and follow shard in a single threaded manner. This allows for better unit testing and makes it easier to add stats. * All write operations read from the shard changes api should be added to a buffer instead of directly sending them to the bulk shard operations api. This allows us to apply backpressure. In this PR there is a limit that controls how many write ops are allowed in the buffer, after which no new reads will be performed until the number of ops is below that limit. * The shard changes api includes the current global checkpoint on the leader shard copy. This allows reading to be a more self-sufficient process, instead of relying on a background thread to fetch the leader shard's global checkpoint. * Reading write operations from the leader shard (via shard changes api) is a separate step from writing the write operations (via bulk shards operations api), whereas before a read would immediately result in a write. * The bulk shard operations api returns the local checkpoint on the follow primary shard, to keep the shard follow task up to date with what has been written. * Moved the shard follow logic that was previously in ShardFollowTasksExecutor to ShardFollowNodeTask. * Moved over the changes from #31242 to make the shard follow mechanism resilient to node and shard failures. Relates to #30086	2018-07-10 16:00:55 +02:00
Martijn van Groningen	ac654cbc10	Follow engine should not fill gaps upon promotion and recovery (#31751 ) Closes #31318	2018-07-03 13:15:06 +02:00
Martijn van Groningen	8ecfcc3b80	muted tests that will be replaced by the shard follow task refactoring: https://github.com/elastic/elasticsearch/pull/31581	2018-06-29 11:47:46 +02:00
Nhat Nguyen	1185ddbcc6	Replaces testClassesDir with testClassesDirs in ccr build Relates #30389	2018-06-28 11:24:41 -04:00
Nhat Nguyen	2c56df631d	Adjusts transport actions in CCR This commit adjusts the ccr actions according to the recent changes upstream.	2018-06-23 18:10:15 -04:00
Nhat Nguyen	34f127be3c	CCR: Remove index name resolver from CCR actions Relates #31002	2018-06-20 13:20:24 -04:00
Nhat Nguyen	c74cd30ac6	Remove request type parameter from CCR actions Relates #31405	2018-06-19 10:49:05 -04:00
Martijn van Groningen	50ce990305	added missing serialization tests	2018-06-19 10:22:58 +02:00
Martijn van Groningen	73c9dd976b	Remove action request builders.	2018-06-15 12:32:08 +02:00
Tanguy Leroux	18938aab39	Adapt ShardFollowTasksExecutor after #31031	2018-06-15 11:46:08 +02:00
Martijn van Groningen	cc824ebb5e	[CCR] Added more validation to follow index api. (#31068 )	2018-06-15 07:39:53 +02:00
Nhat Nguyen	1ccb34ac77	Remove unused imports	2018-06-14 11:44:20 -04:00
Jason Tedor	64b4cdeda6	Merge remote-tracking branch 'elastic/master' into ccr * elastic/master: (53 commits) Painless: Restructure/Clean Up of Spec Documentation (#31013) Update ignore_unmapped serialization after backport Add back dropped substitution on merge high level REST api: cancel task (#30745) Enable engine factory to be pluggable (#31183) Remove vestiges of animal sniffer (#31178) Rename elasticsearch-nio to nio (#31186) Rename elasticsearch-core to core (#31185) Move cli sub-project out of server to libs (#31184) [DOCS] Fixes broken link in auditing settings QA: Better seed nodes for rolling restart [DOCS] Moves ML content to stack-docs [DOCS] Clarifies recommendation for audit index output type (#31146) Add nio-transport as option for http smoke tests (#31162) QA: Set better node names on rolling restart tests Add support for ignore_unmapped to geo sort (#31153) Share common parser in some AcknowledgedResponses (#31169) Fix random failure on SearchQueryIT#testTermExpansionExceptionOnSpanFailure Remove reference to multiple fields with one name (#31127) Remove BlobContainer.move() method (#31100) ...	2018-06-07 23:33:42 -04:00
Simon Willnauer	5c6711b8a4	Use a `_recovery_source` if source is omitted or modified (#31106) Today if a user omits the `_source` entirely or modifies the source on indexing we have no chance to re-create the document after it has been added. This is an issue for CCR and recovery based on soft deletes which we are going to make the default. This change adds an additional recovery source if the source is disabled or modified that is only kept around until the document leaves the retention policy window. This change adds a merge policy that efficiently removes this extra source on merge for all documents that are live and not in the retention policy window anymore.	2018-06-07 07:39:28 +02:00
Jason Tedor	20a2f646e2	Fix off-by-one error in chunks coordinator (#31147) This commit fixes an off-by-one error in the chunks coordinator where the batches would be of size one more than the batch size.	2018-06-06 19:53:49 -04:00
Jason Tedor	bf1152fcc6	Use follower primary term when applying operations (#31113 ) The primary shard copy on the following has authority of the replication operations that occur on the following side in cross-cluster replication. Yet today we are using the primary term directly from the operations on the leader side. Instead we should be replacing the primary term on the following side with the primary term of the primary on the following side. This commit does this by copying the translog operations with the corrected primary term. This ensures that we use this primary term while applying the operations on the primary, and when replicating them across to the replica (where the replica request was carrying the primary term of the primary shard copy on the follower).	2018-06-06 11:03:57 -04:00
Jason Tedor	d230548401	Remove use of deprecated methods to perform request (#31117) The old perform request methods on the REST client have been deprecated in favor of using request-flavored methods. This commit addresses the use of these deprecated methods in the CCR test suite.	2018-06-06 05:09:55 -04:00
Nhat Nguyen	6ee6404e94	Adapt changes in PersistentTaskParams Relates #31045	2018-06-04 14:48:04 -04:00
Nhat Nguyen	87abb49145	Adapt changes in AcknowledgeResponse Relates #30983	2018-06-04 14:47:22 -04:00
Nhat Nguyen	9564b60194	Adjust CCR Actions after RequestBuilder is removed CCR side of #30966	2018-06-01 23:09:59 -04:00
Nhat Nguyen	2a9a2002e6	CCR: Tighten requesting range check on leader This commit clarifies the origin of the global checkpoint that the following shard uses and replaces illegal_state_exception by an assertion. Relates #30980	2018-05-31 20:00:33 -04:00
Nhat Nguyen	fa54be2dcd	CCR: Do not apply minimization to requesting range on leader (#30980) Today before reading operations on the leading shard, we apply minimization to the requesting range using the global checkpoint. However, this might make the request invalid if the following shard generates a requesting range based on the global checkpoint from a primary shard and sends that request to a replica whose global checkpoint is lagging. Another issue is that we are mutating the request when applying minimization. If the request becomes invalid on a replica, we will reroute the mutated request instead of the original one to the primary. This commit removes the minimization and replaces it by a range check with the local checkpoint.	2018-05-31 15:14:32 -04:00
Martijn van Groningen	7e8cf768cf	changed persistent task name to be of similar structure as the others	2018-05-31 15:16:13 +02:00
Martijn van Groningen	a82f2e31b4	[CCR] Also copy routing_num_shards from leader to follow index. (#30894 ) Bug was introduced when create and follow api was added in #30602	2018-05-31 08:03:53 +02:00
Nhat Nguyen	f25ee254cc	Mute ShardChangesIT#testFollowIndex	2018-05-30 14:29:58 -04:00
Martijn van Groningen	adca32eae7	no need to resolve index name as only concrete index names are used	2018-05-30 12:42:35 +02:00
Martijn van Groningen	4a20dca5fe	Required changes after merging in master.	2018-05-30 10:26:49 +02:00
Martijn van Groningen	51caefe46c	[CCR] Sync mappings between leader and follow index (#30115 ) The shard changes api returns the minimum IndexMetadata version the leader index needs to have. If the leader side is behind on IndexMetadata version then follow shard task waits with processing write operations until the mapping has been fetched from leader index and applied in follower index in the background. The cluster state api is used to fetch the leader mapping and put mapping api to apply the mapping in the follower index. This works because put mapping api accepts fields that are already defined. Relates to #30086	2018-05-28 07:37:27 +02:00
Martijn van Groningen	e477147143	[CCR] Add create and follow api (#30602 ) Also renamed FollowExisting* internal names to just Follow* and fixed tests	2018-05-26 15:05:40 +02:00
Martijn van Groningen	7942e4082a	build: enhance check task instead of overwriting it. (test task didn't run when check task ran)	2018-05-16 10:54:15 +02:00
Martijn van Groningen	596ec1848e	[CCR] Add validation checks that were left out of #30120 (#30463 )	2018-05-16 09:46:03 +02:00
Martijn van Groningen	23204e3d09	[CCR] Fixed follow and unfollow api url path according to design. The TODOs in the rest actions were incorrect. The problem was that these rest actions used `follow_index` as first named variable in the path under which the rest actions were registered. Other candidate rest actions that also have a named variable as first element in the path (but with a different name) get resolved as rest parameters too and passed down to the rest action that actually ends up getting executed. In the case of the follow index api, an `index` parameter got passed down to `RestFollowExistingAction`, but that param was never used. This caused the follow index api call to fail, because of unused http parameters. This change doesn't fix that problem, but works around it by using `index` as named variable for the follow index (instead of `follow_index`). Relates to #30102	2018-05-16 09:07:50 +02:00
Martijn van Groningen	64b97313d5	[CCR] Make cross cluster replication work with security (#30239 ) If security is enabled today with ccr then the follow index api will fail because the system user does not have privileges to use the shard changes api. The system user is used because the persistent tasks that keep the shards in sync run in the background and the user that invokes the follow index api only starts those background processes. I think it is better that the system user isn't used by the persistent tasks that keep shards in sync, and that they instead run as the same user that invoked the follow index api and use the permissions that that user has. This is what this PR does, and this is done by keeping track of security headers inside the persistent task (similar to how rollup does this). This PR also adds a cluster ccr privilege that allows a user to follow or unfollow an index. Finally, a user that wants to follow an index needs to have read and monitor privileges on the leader index and monitor and write privileges on the follow index.	2018-05-16 07:48:32 +02:00
Martijn van Groningen	bb6586dc5f	[CCR] Read changes from Lucene instead of translog (#30120 ) This commit adds an API to read translog snapshot from Lucene, then cut-over from the existing translog to the new API in CCR. Relates #30086 Relates #29530	2018-05-09 17:35:27 -04:00
Martijn van Groningen	ad499fc178	[CCR] added rest specs and simple rest test for follow and unfollow apis (#30123 ) Also added an acknowledge field to the follow and unfollow api responses, since these apis currently return an empty response, and fixed a bug in the unfollow api that didn't clean up node tasks properly.	2018-05-07 14:18:28 +02:00
Nhat Nguyen	6e0d0feca0	Enable MockHttpTransport in ShardChangsIT CCR side of #29601	2018-05-04 13:44:18 -04:00
Nhat Nguyen	8fefa8a661	Update InternalEngine tests on ccr side for #30121 Relates #30121	2018-05-04 10:57:54 -04:00
Nhat Nguyen	d621fc7a00	Add tombstone document into Lucene for Noop (#30226 ) This commit adds a tombstone document into Lucene for every No-op. With this change, the Lucene index is expected to have a complete history of operations like Translog. In fact, this guarantee is subject to the soft-deletes retention merge policy. Relates #29530	2018-05-02 09:08:29 -04:00
Nhat Nguyen	eb4281edef	CCR side #30244 Relates #30244	2018-05-01 21:08:24 -04:00
Martijn van Groningen	8a2df6c3b9	[CCR] Only normalize -1 seqno in shard changes request. (#30238 ) Prior to this change a -1 seqno would be normalized earlier, which caused a leader shard containing a single operation to be ignored. Closes #30227	2018-05-01 08:40:23 +02:00
Martijn van Groningen	e6b88fa5a0	removed duplicated license	2018-04-25 12:18:02 +02:00
Martijn van Groningen	5a67a0f78f	Applying changes required for ccr after moving ccr code to elasticsearch	2018-04-25 08:03:29 +02:00
Martijn van Groningen	9b9d0f9057	Enabled licence header check and fixed unchecked casts. (#4408 )	2018-04-20 11:15:52 +02:00
Martijn van Groningen	cfd7847628	fixed issues after merging in master	2018-04-20 07:59:13 +02:00
Nhat Nguyen	f97aec7b8b	Sibling of enforce access to translog via engine Since elastic/elasticsearch#29542, we no longer expose translog instance but only provide creating translog snapshot method. This commit adapts that change in CCR branch. Relates elastic/elasticsearch#29542	2018-04-18 11:54:00 -04:00
Martijn van Groningen	56ca59a513	Add the ability to the follow index to follow an index in a remote cluster. The follow index api completely reuses CCS infrastructure that was exposed via: https://github.com/elastic/elasticsearch/pull/29495 This means that the leader index parameter supports the same ccs index to indicate that an index resides in a different cluster. I also added a qa module that smoke tests the cross cluster nature of ccr. The idea is that this test just verifies that ccr can read data from a remote leader index and that is it, no crazy randomization or indirectly testing other features.	2018-04-17 07:36:40 +02:00
Martijn van Groningen	c0d42e9cd1	Fixed test	2018-04-16 10:48:46 +02:00
Martijn van Groningen	a94b38b88e	Fixed compile errors and test failures after merging master into ccr.	2018-04-13 16:35:09 +02:00
Martijn van Groningen	d77f756f5c	ccr: use indices stats api to fetch global checkpoint of the follower shards and keep track of shard follow stats inside shard follow stats' node task instead of persistent task status. By maintaining the shard follow stats inside its node task the stats update is quicker as no cluster state update is required. The stats are now transient, meaning if the task is going to run on a different node then the stats are gone too. Currently only the processed global checkpoint is being tracked and this is being restored when a shard follow node task starts via the indices stats api (the reason for the first change in this commit). For other stats that we may add in the future (like fetch_time, see: https://gist.github.com/s1monw/dba13daf8493bf48431b72365e110717), it is ok if we start from zero in case a shard follow task moves to another node.	2018-04-05 14:52:20 +02:00
Martijn van Groningen	d976fa44e7	Removed LocalCheckpointTracker usage.	2018-03-29 07:41:23 +02:00
Martijn van Groningen	a22a7d079d	ccr: Added maximum translog limit that a single shard changes response can return. This limit is based on the estimated number of bytes in each translog operation that falls between the minimum and maximum request sequence number. If this limit is met then the shard follow task executor will make sure that a subsequent shard changes request will be performed to fetch the remaining translog operations. This limit is needed in order to protect against returning too many translog operations in a single shard changes response. Relates to #2436	2018-03-28 15:49:57 +02:00
Martijn van Groningen	282740610b	Fixed test after merging in master branch.	2018-03-28 09:54:41 +02:00
Nhat Nguyen	51111a8106	CCR: Stop FollowExistingIndexAction after report failure (#4111 ) We check for the existence of both leader and follower index, then properly report to the caller. However, we do not return after reporting failure. This causes the caller to receive an exception twice: IllegalArgumentException then NullPointerException. This commit makes sure to stop the action after reporting failure.	2018-03-26 13:56:47 -04:00
Martijn van Groningen	9e4c68c389	Fixed compile and test errors after merging in master	2018-03-16 17:47:10 +01:00
Martijn van Groningen	10cfa21a68	required changes after merge master branch into ccr branch.	2018-02-22 15:03:33 +01:00
Martijn van Groningen	1a9a7ffe97	removed hack	2018-02-07 17:54:28 +01:00
Martijn van Groningen	c442d14f1d	Several changes that were required after merging master into the ccr branch.	2018-02-05 13:25:58 +01:00
Martijn van Groningen	4e818254ad	re-enabled java integration tests	2018-01-25 14:18:34 +01:00
Martijn van Groningen	05d3d2e49c	fix packages after merge	2018-01-24 09:28:42 +01:00
Jason Tedor	9b6bb2c635	Enable run task for CCR This commit enables the run task for ccr by specifying that the ccr project not be evaluated until after core is evaluated. This is important since ccr is alphabetically before core and thus Gradle evaluates it first. Relates #3665	2018-01-22 15:07:20 -05:00
Martijn van Groningen	83a82d83d0	Moved ccr source code to its own gradle module after xpack split.	2018-01-22 11:09:04 +01:00

582 Commits