OpenSearch

Commit Graph

Author	SHA1	Message	Date
Boaz Leskes	d6c2b4f7c5	Adapt InternalTestCluster to auto adjust `minimum_master_nodes` (#21458) #20960 removed `LocalDiscovery` and we now use `ZenDiscovery` in all our tests. To keep cluster forming fast, we are using a `MockZenPing` implementation which uses static maps to return instant results, making master election fast. Currently, we don't set `minimum_master_nodes`, causing the occasional split brain when starting multiple nodes concurrently and their pinging is so fast that it misses the fact that one of the nodes has elected itself master. To solve this, `InternalTestCluster` is modified to behave like a true cluster and manage and set `minimum_master_nodes` correctly with every change to the number of nodes. Tests that want to manage the settings themselves can opt out using a new `autoMinMasterNodes` parameter to the `ClusterScope` annotation. Having `min_master_nodes` set means the started node may need to wait for other nodes to be started as well. To combat this, we set `discovery.initial_state_timeout` to `0` and wait for the cluster to form once all nodes have been started. Also, because a node may wait and ping while other nodes are started, `MockZenPing` is adapted to wait rather than busy-ping.	2016-11-15 13:42:26 +00:00
Simon Willnauer	66fbb0dbc2	Don't fail in `afterExecute` if context is already closed (#21563) We run an assert on a potentially closed thread context. This should not bubble up the `IllegalStateException`.	2016-11-15 13:55:50 +01:00
Adrien Grand	54809065a6	Make PercolatorFieldMapper get a QueryShardContext lazily.	2016-11-15 12:02:40 +01:00
Simon Willnauer	200a2850a9	[TEST] Don't stop MockAppender some nodes might concurrently use it	2016-11-15 10:48:39 +01:00
Boaz Leskes	6d9af2fff4	Uncommitted mapping updates should not affect existing indices (#21306) When processing a mapping update, the master currently creates an `IndexService` and uses its mapper service to do the hard work. However, if the master is also a data node and it already has an instance of `IndexService`, we currently reuse the `MapperService` of that instance. Sadly, since mapping updates change the in-memory objects, this means that a mapping change that can be rejected later on during cluster state publishing will leave a side effect on the index in question, bypassing the cluster state safety mechanism. This commit removes this optimization and replaces the `IndexService` creation with a direct creation of a `MapperService`. Also, this fixes an issue where multiple updates from multiple shards for the same field caused unneeded cluster state publishing, as the current code always created a new cluster state. This was discovered while researching #21189	2016-11-15 10:47:34 +01:00
Adrien Grand	ad94bea0bb	Remove XPointValues. (#21541) This class had been added to address a bug in PointValues, which has been fixed since then.	2016-11-15 10:11:41 +01:00
Martijn van Groningen	8a3a885058	inner_hits: Skip adding a parent field to nested documents. Otherwise an empty string gets added as _parent field. Closes #21503	2016-11-15 07:32:28 +01:00
Ryan Ernst	c7bd4f3454	Tests: Add TestZenDiscovery and replace uses of MockZenPing with it (#21488) This change adds a test discovery (which internally uses the existing mock zenping by default). Having the mock the test framework selects be a discovery greatly simplifies discovery setup (no more weird callback to a Node method).	2016-11-14 21:46:10 -08:00
Ryan Ernst	d14c470b89	Remove generics from ActionRequest closes #21368	2016-11-14 15:32:01 -08:00
Jason Tedor	48579cccab	Add socket permissions for tribe nodes Today when a node starts, we create dynamic socket permissions based on the configured HTTP ports and transport ports. If no ports are configured, we use the default port ranges. When a tribe node starts, a tribe node creates an internal node client for connecting to each remote cluster. If neither an explicit HTTP port nor transport ports were specified, the default port ranges are large enough for the tribe node and its internal node clients. If an explicit HTTP port or transport port was specified for the tribe node, then socket permissions for those ports will be created, but not for the internal node clients. Whether the internal node clients have explicit ports specified, or attempt to bind within the default range, socket permissions for these will not have been created and the internal node clients will hit a permissions issue when attempting to bind. This commit addresses this issue by also accounting for tribe nodes when creating the dynamic socket permissions. Additionally, we add our first real integration test for tribe nodes. Relates #21546	2016-11-14 15:09:45 -05:00
Jay Modi	87d76c3ff8	assert blocking calls are not made on the cluster state update thread This commit adds an assertion to ensure that we do not introduce blocking calls in code that is called in a ClusterStateListener or another part of the cluster state update process.	2016-11-14 14:30:01 -05:00
Jason Tedor	9fb54f4ef8	Remove unnecessary hash map copy in o.e.b.Security This commit removes an unnecessary copying of the tribe node group settings in o.e.b.Security.	2016-11-14 13:49:16 -05:00
Jason Tedor	a12f09317d	Fallback to settings if transport profile is empty If the transport profile does not contain a TCP port range, we fall back to the top-level settings.	2016-11-14 13:48:12 -05:00
Jason Tedor	491a945ac8	Add socket permissions for tribe nodes Today when a node starts, we create dynamic socket permissions based on the configured HTTP ports and transport ports. If no ports are configured, we use the default port ranges. When a tribe node starts, a tribe node creates an internal node client for connecting to each remote cluster. If neither an explicit HTTP port nor transport ports were specified, the default port ranges are large enough for the tribe node and its internal node clients. If an explicit HTTP port or transport port was specified for the tribe node, then socket permissions for those ports will be created, but not for the internal node clients. Whether the internal node clients have explicit ports specified, or attempt to bind within the default range, socket permissions for these will not have been created and the internal node clients will hit a permissions issue when attempting to bind. This commit addresses this issue by also accounting for tribe nodes when creating the dynamic socket permissions. Additionally, we add our first real integration test for tribe nodes.	2016-11-14 11:58:44 -05:00
Simon Willnauer	1d8c8529ed	Remove `IndexTemplateAlreadyExistsException` and `IndexShardAlreadyExistsException` (#21539) Both exceptions can be replaced with Java built-in exceptions, IAE and ISE respectively. This should be backported partially to 5.x, where the transport layer code should be preserved. Relates to #21494	2016-11-14 17:09:57 +01:00
Simon Willnauer	26375256ff	Enable 5.x to 6.x BWC tests (#21537) This commit enables real BWC testing against a 5.1 snapshot. All REST tests plus rolling upgrade test now run against a mixed version cross major version cluster.	2016-11-14 17:03:57 +01:00
Yannick Welsch	d3e97ce6cd	Fix line length in TCPTransportTests Makes checkstyle happy	2016-11-14 16:55:14 +01:00
Yannick Welsch	d42f7eec61	Check valid cluster service state transitions (#21538) This commit adds assertions to check whether the cluster service state transitions in a way that we expect it to. Relates to #21379.	2016-11-14 16:49:25 +01:00
Simon Willnauer	26a8a94e56	[TEST] Add test to ensure `transport.tcp.compress` works This adds a basic unit test to ensure `transport.tcp.compress` has an effect on all basic TcpTransport implementations. Relates to #21526	2016-11-14 16:13:44 +01:00
Simon Willnauer	7d4bde8e00	remove forbidden API	2016-11-14 15:30:07 +01:00
Yannick Welsch	8655cd7182	Add assertion that checks that the same shard with same id is not added to same node (#21498) Adds an assertion that checks that the same shard with same id is not added to same node. Previously we would just silently ignore the second shard being added.	2016-11-14 15:14:14 +01:00
Simon Willnauer	bdc942fa72	Enable 5.x to 6.x BWC tests This commit enables real BWC testing against a 5.1 snapshot. All REST tests plus rolling upgrade test now run against a mixed version cross major version cluster.	2016-11-14 14:26:49 +01:00
Adrien Grand	1fd5c47e7f	Upgrade to lucene-6.3.0. (#21464)	2016-11-14 09:36:45 +01:00
Jason Tedor	19decd7552	Hack around cluster service and logging race When a cluster update task executes, there can be log messages after the update task has finished processing and the new cluster state becomes visible. The visibility of the cluster state allows the test thread in UpdateSettingsIT#testUpdateAutoThrottleSettings and UpdateSettingsIT#testUpdateMergeMaxThreadCount to proceed. The test thread will remove and stop a mock appender setup at the beginning of the test. The log messages in the cluster state update task that occur after processing has finished can race with the removal of the appender. Log4j will grab a reference to the appenders when processing these log messages, and this races with the removal and stopping of the appenders. If Log4j grabs a reference to the appenders before the mock appender has been removed, and the test thread subsequently removes and stops the appender before Log4j has appended the log message, Log4j will get angry that we are appending to a stopped appender, causing the test to fail. This commit addresses this race by waiting for the cluster state update task to have finished processing before freeing the test thread to make its assertions and finally remove and stop the appender. Yes, this is a hack. Relates #21518	2016-11-13 18:06:12 -05:00
Jason Tedor	d273419d00	Do not prematurely shutdown Log4j When a node closes, we shut down logging as the last statement. This statement must be last lest any subsequent attempts to log will blow up by running into security permissions. Yet, in the case of a tribe node this isn't enough. The first internal tribe node to close will shut down logging, and subsequent node closes will blow up with the aforementioned problem. This commit migrates the Log4j shutdown to occur as part of the shutdown hook that closes the node, after all nodes have closed. Consequently, we can remove a hack in the test infrastructure to prevent Log4j shutdowns when internal test nodes close and instead just register a single shutdown hook that runs when the test JVM exits. Relates #21519	2016-11-13 17:27:30 -05:00
Ali Beyad	38023fb58d	[TEST] testRebalancePossible() may not have an assigned node id	2016-11-11 23:10:34 -05:00
Igor Motov	ca639e8c86	Tests: Disable merge in SearchCancellationTests We have to have at least 2 segments for the test to work and sometimes random merge policy merges them into one.	2016-11-11 18:22:28 -05:00
Igor Motov	058b6e019c	Tests: clean search scroll at the end of SearchCancellationIT Under some rare conditions search cancellation response might not fully clean scroll context. For now this commit adds the cleaning operation to the test, and we will address the root cause in https://github.com/elastic/elasticsearch/issues/21511	2016-11-11 18:22:15 -05:00
Ali Beyad	5f1d108704	[TEST] reduce the number of snapshotted shards to 1 in testSnapshotSucceedsAfterSnapshotFailure() so that we are more likely to trigger I/O exceptions on writing the control files during the finalize phase of snapshotting (with the aim of triggering an I/O failure when writing pending-index-*).	2016-11-11 16:22:11 -05:00
Jason Tedor	9352d16602	Enable appender exceptions in UpdateSettingsIT This commit sets the mock appender in UpdateSettingsIT to not ignore exceptions. This means that when an exception is hit, we will see an actual stack trace that could be useful in debugging a non-reproducible test failure. Relates #21461	2016-11-11 12:41:20 -05:00
Ali Beyad	c9c3992f94	[TEST] remove AwaitsFix from testSnapshotSucceedsAfterSnapshotFailure, turns out the issue is specific to Java 9 v143	2016-11-11 12:37:04 -05:00
Jason Tedor	79076334ae	Cleanup formatting in UpdateSettingsIT.java This commit cleans up some code formatting in UpdateSettingsIT.java and removes this file from the checkstyle line-length suppressions.	2016-11-11 12:10:32 -05:00
Ali Beyad	8f85e388da	[TEST] mute the testSnapshotSucceedsAfterSnapshotFailure() test until it's clear what is going wrong. Relates #21496	2016-11-11 11:50:23 -05:00
Jason Tedor	372480a16a	Mark SearchQueryIT test as awaits fix This commit marks the test SearchQueryIT#testRangeQueryWithTimeZone as awaits fix. Relates #21501	2016-11-11 11:33:17 -05:00
Yannick Welsch	9cbb23f3d7	Test distinctNodes	2016-11-11 17:29:51 +01:00
Ali Beyad	a5ccd02e76	Makes snapshot throttling test go much faster (#21485) [TEST] Makes the snapshot throttling test go much faster. Before, the snapshot throttling test would throttle at a rate of 0.5 kb per second, even though it would snapshot/restore about 25 kb of data. This commit increases the throttling rate to 10kb per second, so we still test the throttling mechanism while speeding up the test from taking 30 plus seconds down to 2 seconds or less.	2016-11-11 10:52:26 -05:00
Yannick Welsch	d195ef258b	test fix	2016-11-11 16:09:34 +01:00
Yannick Welsch	1635baf876	fix tests that add duplicate shards	2016-11-11 15:28:40 +01:00
Yannick Welsch	7099f10909	Add assertion that checks that the same shard with same id is not added to same node	2016-11-11 15:28:40 +01:00
Ali Beyad	adb7aaded4	[TEST] adds randomness between atomic and non-atomic move operations in MockRepository	2016-11-11 09:07:28 -05:00
Yannick Welsch	2d3a52c0f2	Cache successful shard deletion checks (#21438) Each node checks on every cluster state update if there are shards that it can possibly delete from its disk. It decides this by doing a file-system lookup for each shard id that is fully allocated in the cluster. With lots of shards, this amounts to lots of Files.exists() checks, considerably slowing down cluster state updates. This commit adds a caching layer so that the Files.exists() checks can be skipped if not needed.	2016-11-11 10:06:15 +01:00
Igor Motov	df965fc9b3	Task cancellation command should wait for all child nodes to receive cancellation request before returning Currently the task cancellation command returns as soon as the top-level parent task is marked as cancelled. This creates race conditions in tests where child tasks on other nodes may continue to run for some time after the main task is cancelled. This commit fixes this situation by making the task cancellation command wait until the cancellation has propagated to all nodes that have child tasks. Closes #21126	2016-11-10 22:43:43 -05:00
Igor Motov	06a50fa31e	ShardActiveResponseHandler shouldn't hold to an entire cluster state ShardActiveResponseHandler doesn't need to hold onto an entire cluster state since it only needs to know the cluster state version. It seems that on overloaded systems where nodes are unresponsive, holding onto a lot of different cluster states can make the situation worse. Closes #21394	2016-11-10 22:28:49 -05:00
Ali Beyad	3001b636db	Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469) Ensures pending index-* blobs are deleted when snapshotting. The index-* blobs are generational files that maintain the snapshots in the repository. To write these atomically, we first write a `pending-index-` blob, then move it to `index-`, which also deletes `pending-index-` in case it's not a file-system level move (e.g. S3 repositories). For example, to write the 5th generation of the index blob for the repository, we would first write the bytes to `pending-index-5` and then move `pending-index-5` to `index-5`. It is possible that we fail after writing `pending-index-5`, but before moving it to `index-5` or deleting `pending-index-5`. In this case, we will have a dangling `pending-index-5` blob lying around. Since snapshot #5 would have failed, the next snapshot assumes a generation number of 5, so it tries to write to `index-5`, which first tries to write to `pending-index-5` before moving the blob to `index-5`. Since `pending-index-5` is left over from the previous failure, the snapshot fails as it cannot overwrite this blob. This commit solves the problem by, first, adding a UUID to the `pending-index-` blobs and, second, strengthening the logic around failure to write the `index-*` generational blob to ensure pending files are deleted on cleanup. Closes #21462	2016-11-10 21:45:02 -05:00
Ryan Ernst	48bfb142b9	Remove (again) test uses of onModule (#21414) This change was reverted after it caused random test failures. This was due to a copy/paste error in the original PR which caused the mock version of ClusterInfoService to be used whenever the mock ZenPing was used, and the real ClusterInfoService to be used when MockZenPing was not used.	2016-11-10 16:06:14 -08:00
Areek Zillur	7ed195fe93	[TEST] Add assertBusy when checking for pending operation counter after tests Currently, pending operations can complete after a test with a disruption scheme completes. This commit waits for the pending operation counter to complete after the tests are run.	2016-11-10 18:35:52 -05:00
Areek Zillur	5b4c3fb1ac	Revert "Add trace logging when aquiring and releasing operation locks for replication requests" This reverts commit `4e996ca9f5`.	2016-11-10 18:35:25 -05:00
Alexander Lin	0219a211d3	Allows multiple patterns to be specified for index templates (#21009) Allows for an array of index template patterns to be provided to an index template, and renames the field from 'template' to 'index_pattern'. Closes #20690	2016-11-10 18:00:30 -05:00
Ali Beyad	5c4392e58a	[TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed	2016-11-10 17:13:39 -05:00
Jason Tedor	179dd885e2	Avoid angering Log4j in TransportNodesActionTests When logging a mock exception, Log4j attempts to render the stack trace. On a mock exception, this will be null and Log4j will hit a NullPointerException. This NullPointerException will get recorded in the status logger buffer that we use to ensure that we do not have any misuses of Log4j in production code. This commit replaces the use of a mock exception with an actual exception to avoid angering the Log4j assertions in ESTestCase.	2016-11-10 16:08:08 -05:00

6748 Commits