OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-10 15:05:33 +00:00

Author	SHA1	Message	Date
Ryan Ernst	fb77d8f461	Removed writeTo from TransportResponse and ActionResponse (#44092 ) The base classes for transport requests and responses currently implement Streamable and Writeable. The writeTo method on these base classes is implemented with an empty implementation. Not only does this mislead subclasses into thinking they need to call super.writeTo, but it also can lead to not implementing writeTo when it should have been implemented, or extending one of these classes when not necessary, since there is nothing to actually implement. This commit removes the empty writeTo from these base classes, and fixes subclasses to not call super and in some cases implement an empty writeTo themselves. relates #34389	2019-07-10 12:42:04 -07:00
Zachary Tong	92ad588275	Remove generic on AggregatorFactory (#43664 ) (#44079 ) AggregatorFactory was generic over itself, but it doesn't appear we use this functionality anywhere (e.g. to allow the super class to declare arguments/return types generically for subclasses to override). Most places use a wildcard constraint, and even when a concrete type is specified it wasn't used. But since AggFactories are widely used, this led to the generic touching many pieces of code and making type signatures fairly complex	2019-07-10 13:20:28 -04:00
Nhat Nguyen	b158919542	Do not use mock engine in PrimaryAllocationIT (#44083 ) PrimaryAllocationIT#testForceStaleReplicaToBePromotedToPrimary relies on the flushing when a shard is no longer assigned. This behavior, however, can be randomly disabled in MockInternalEngine. Closes #44049	2019-07-10 12:26:34 -04:00
David Turner	d0f1a756d9	Comment on the extra reroute after failing shards (#44152 ) The `ShardFailedClusterStateTaskExecutor` fails some shards, which performs a reroute, but then sometimes schedules a followup reroute. It's not clear from the code why this followup is necessary, so this commit adds a short comment describing why it's necessary.	2019-07-10 13:24:21 +01:00
David Roberts	cad804df92	[TEST] Mute ShrinkIndexIT Due to https://github.com/elastic/elasticsearch/issues/44164	2019-07-10 13:22:25 +01:00
Martijn van Groningen	913b6a64e8	Replace Streamable w/ Writeable for MultiSearchRequest (#44057 ) This commit replaces usages of Streamable with Writeable for the MultiSearchRequest class. I ran into this when developing a custom action that reuses MultiSearchRequest in the enrich branch. Relates to #34389	2019-07-10 11:13:28 +02:00
Armin Braun	a23d1ed00d	Mute SearchWithRandomExceptionsIT (#44147 ) (#44149 ) * This is failing quite often and we can reproduce it now so we don't need additional test logging on CI * Relates #40435	2019-07-10 08:12:26 +02:00
David Turner	aec44fecbc	Decouple DiskThresholdMonitor & ClusterInfoService (#44105 ) Today the `ClusterInfoService` requires the `DiskThresholdMonitor` at construction time so that it can notify it when nodes report changes in their disk usage, but this is awkward to construct: the `DiskThresholdMonitor` requires a `RerouteService` which requires an `AllocationService` which comes from the `ClusterModule` which requires the `ClusterInfoService`. Today we break the cycle with a `LazilyInitializedRerouteService` which is itself a little ugly. This commit replaces this with a more traditional subject/observer relationship between the `ClusterInfoService` and the `DiskThresholdMonitor`.	2019-07-09 18:43:32 +01:00
David Turner	e70cad4c52	Remove node conn block after connection barrier (#44114 ) Today `testOnlyBlocksOnConnectionsToNewNodes` fails (extremely rarely) if the last attempt to connect to `node0` is delayed for so long that the test runs `nodeConnectionsBlocks.clear()` before the connection attempt obtains the expected connection block. We can turn this into a reliable failure with this delay: ```diff diff --git a/server/src/main/java/org/elasticsearch/cluster/NodeConnectionsService.java b/server/src/main/java/org/elasticsearch/cluster/NodeConnectionsService.java index f48413824d3..9a1d0336bcd 100644 --- a/server/src/main/java/org/elasticsearch/cluster/NodeConnectionsService.java +++ b/server/src/main/java/org/elasticsearch/cluster/NodeConnectionsService.java @@ -300,6 +300,13 @@ public class NodeConnectionsService extends AbstractLifecycleComponent { private final Runnable connectActivity = () -> threadPool.executor(ThreadPool.Names.MANAGEMENT).execute(new AbstractRunnable() { @Override protected void doRun() { + + try { + Thread.sleep(500); + } catch (InterruptedException e) { + throw new AssertionError("unexpected", e); + } + assert Thread.holdsLock(mutex) == false : "mutex unexpectedly held"; transportService.connectToNode(discoveryNode); consecutiveFailureCount.set(0); ``` This commit reverts the extra logging introduced in #43979 and fixes this failure by waiting for the connection attempt to hit the barrier before removing it. Fixes #40170	2019-07-09 17:03:26 +01:00
David Turner	268971db03	Wait for blackholed connection before discovery (#44077 ) Since #42636 we no longer treat connections specially when simulating a blackholed connection. This means that at the end of the safety phase we may have just started a connection attempt which will time out, but the default timeout is 30 seconds, much longer than the 2 seconds we normally allow for post-safety-phase discovery. This commit adds time for such a connection attempt to time out. It also fixes some spurious logging of `this` that now refers to an object with an unhelpful `toString()` implementation introduced in #42636. Fixes #44073	2019-07-09 10:59:53 +01:00
Henning Andersen	748a10866d	Reindex ScrollableHitSource pump data out (#43864 ) Refactor ScrollableHitSource to pump data out and have a simplified interface (callers should no longer call startNextScroll, instead they simply mark that they are done with the previous result, triggering a new batch of data). This eases making reindex resilient, since we will sometimes need to rerun search during retries. Relates #43187 and #42612	2019-07-09 11:50:09 +02:00
David Turner	fd9eebae81	Only apply initial recovery filter to shrunk shard (#44054 ) Today the `index.routing.allocation.initial_recovery._id` setting can only be set on indices that are the result of a shrink, but the filtered allocation decider also applies this filter to shards with a recovery source of `EMPTY_STORE`. The only way to have this setting set while the recovery source is `EMPTY_STORE` is to force-allocate an empty primary, but such a forced allocation ignores this allocation decider. This commit simplifies the allocation decider so that the `initial_recovery` setting only applies to shards with a recovery source of `LOCAL_SHARDS`.	2019-07-09 08:42:18 +01:00
Armin Braun	9eac5ceb1b	Dry up inputstream to bytesreference (#43675 ) (#44094 ) * Dry up Reading InputStream to BytesReference * Dry up spots where we use the same pattern to get from an InputStream to a BytesReferences	2019-07-09 09:18:25 +02:00
Armin Braun	dc8f8e40eb	Fix DedicatedClusterSnapshotRestoreIT testSnapshotWithStuckNode (#43537 ) (#44082 ) * Fix DedicatedClusterSnapshotRestoreIT testSnapshotWithStuckNode * See comment in the test: The problem is that when the snapshot delete works out partially on master failover and the retry fails on `SnapshotMissingException` no repository cleanup is run => we still failed even with repo cleanup logic in the delete path now * Fixed the test by rerunning a create snapshot and delete loop to clean up the repo before verifying file counts * Closes #39852	2019-07-09 06:32:08 +02:00
Armin Braun	03332b5aeb	Don't Consistency Check Broken Repository in Test (#43499 ) (#44071 ) * Missed this one in #42189 and it randomly runs into a situation where the broken mock repo is broken such that we can't get to a consistent end state via a delete * Closes #43498	2019-07-08 17:21:40 +02:00
Tanguy Leroux	251287f89d	Check again on-going snapshots/restores of indices before closing (#43873 ) Today we prevent any index that is actively snapshotted or restored from being closed. This verification is done during the execution of the first phase of index closing (ie before blocking the indices). We should also do this verification again in the last phase of index closing (ie after the shard sanity checks and right before actually changing the index state and the routing table) because a snapshot/restore could sneak in while the shards are verified-before-close.	2019-07-08 17:07:04 +02:00
Mark Tozzi	299a52c17d	Enable validating user-supplied missing values on unmapped fields (#43718 ) (#43940 ) Provides a hook for aggregations to introspect the `ValuesSourceType` for a user supplied Missing value on an unmapped field, when the type would otherwise be `ANY`. Mapped field behavior is unchanged, and still applies the `ValuesSourceType` of the field. This PR just provides the hook for doing this, no existing aggregations have their behavior changed.	2019-07-08 10:46:23 -04:00
Armin Braun	2918363e90	Simplify BlobStoreRepository (Flatten Nested Classes) (#42833 ) (#44060 ) * In the current codebase it is hardly obvious what code operates on a shard and is run by a datanode and what code operates on the global metadata and is run on master * Fixed by adjusting the method names accordingly * The nested context classes don't add much if any value, they simply spread out the parameters that go into a shard snapshot create or delete all over the place since their constructors can be inlined in all spots * Fixed by flattening the nested classes into BlobStoreRepository * Also: * Inlined the other single use inner classes	2019-07-08 14:57:27 +02:00
Armin Braun	afe81fd625	Some Cleanup in Test Framework (#44039 ) (#44059 ) * Remove some obvious dead code * Move assert methods that were only used in a single test class to the child they belong to * Inline some redundant methods	2019-07-08 14:15:31 +02:00
David Turner	3f3bcb23c2	AwaitsFix testForceStaleReplicaToBePromotedToPrimary Relates #44049	2019-07-08 11:26:57 +01:00
David Turner	3129f5b42e	Do not copy initial recovery filter during split (#44053 ) If an index is the result of a shrink then it will have a value set for `index.routing.allocation.initial_recovery._id`. If this index is subsequently split then this value will be copied over, forcing the initial allocation of the split shards to occur on the node on which the shrink took place. Moreover if this node no longer exists then the split will fail. This commit suppresses the copying of this setting when splitting an index. Fixes #43955	2019-07-08 10:32:05 +01:00
Armin Braun	af9b98e81c	Recursively Delete Unreferenced Index Directories (#42189 ) (#44051 ) * Use ability to list child "folders" in the blob store to implement recursive delete on all stale index folders when cleaning up instead of using the diff between two `RepositoryData` instances to cover aborted deletes * Runs after every delete operation * Relates #13159 (fixing most of the issues caused by unreferenced indices, leaving some meta files to be cleaned up only)	2019-07-08 10:55:39 +02:00
Przemyslaw Gomulka	247f2dabad	Fix decimal point parsing for date_optional_time backport(#43859 ) #44050 Joda allowed the decimal point in date_optional_time and strict_date_optional_time to be either a dot (.) or a comma (,). For our java.time implementation we should also extend this for strict_date_optional_time-nanos. The approach to fix this is the same as in the iso8601 parser. Closes #43730	2019-07-08 09:56:01 +02:00
Armin Braun	f6efc55556	Fix SnapshotResiliencyTest (#44015 ) (#44041 ) * Closes #43989	2019-07-07 19:59:16 +02:00
Armin Braun	990ac4ca83	Some Cleanup in BlobStoreRepository (#43323 ) (#44043 ) * Some Cleanup in BlobStoreRepository * Extracted from #42833: * Dry up index and shard path handling * Shorten XContent handling	2019-07-07 19:50:46 +02:00
Nhat Nguyen	9089820d8f	Enable indexing optimization using sequence numbers on replicas (#43616 ) This PR enables the indexing optimization using sequence numbers on replicas. With this optimization, indexing on replicas should be faster and use less memory as it can forgo the version lookup when possible. This change also deactivates the append-only optimization on replicas. Relates #34099	2019-07-05 22:12:08 -04:00
Yannick Welsch	504a43d43a	Move ConnectionManager to async APIs (#42636 ) This commit converts the ConnectionManager's openConnection and connectToNode methods to async-style. This will allow us to not block threads anymore when opening connections. This PR also adapts the cluster coordination subsystem to make use of the new async APIs, allowing to remove some hacks in the test infrastructure that had to account for the previous synchronous nature of the connection APIs.	2019-07-05 20:40:22 +02:00
Yannick Welsch	88783927d1	Weaken assertion in PublicationTransportHandler (#44014 ) These assertions do not hold true when a master fails during publication and quickly becomes master again, publishing a new cluster state in a higher term which races against the previous cluster state publication to self (which does not matter anyway). Relates #43994 Closes #44012	2019-07-05 18:27:42 +02:00
Yannick Welsch	1220ff5b6d	Publish to self through transport (#43994 ) This commit ensures that cluster state publications to self also go through the transport layer. This allows voting-only nodes to intercept the publication to self. Fixes an issue discovered by a test failure where a voting-only node, which was the only bootstrapped node, would not step down as master after state transfer because publishing to self would succeed. Closes #43631	2019-07-05 13:00:52 +02:00
Yannick Welsch	5cdf3ff3fa	Revert "[TEST] Mute RemoteClusterServiceTests.testCollectNodes" This reverts commit d8a2970fa40d34676242d520502fcd00a2a2fbfa.	2019-07-05 11:02:42 +02:00
David Turner	06df0c0a4c	Improve RetentionLease(Bgrd)SyncAction#toString() (#43987 ) Today `RetentionLeaseSyncAction.Request` and `RetentionLeaseBackgroundSyncAction.Request` both describe themselves as `Request{...}` in the value returned from their respective `toString()` methods. This commit adds the name of the owning class to both so we have something a bit easier to search for and so we can distinguish foreground from background syncs in logs and test failures and so on.	2019-07-05 09:58:35 +01:00
David Turner	435a83f3fd	Add more logging to testOnlyBlocksOnConnectionsToNewNodes (#43979 ) Some more output from this occasionally-failing test tracked in #40170.	2019-07-05 09:54:48 +01:00
Jim Ferenczi	cdf55cb5c5	Refactor index engines to manage readers instead of searchers (#43860 ) This commit changes the way we manage refreshes in the index engines. Instead of relying on a SearcherManager, this change uses a ReaderManager that creates ElasticsearchDirectoryReader when needed. Searchers are now created on-demand (when acquireSearcher is called) from the current ElasticsearchDirectoryReader. It also slightly changes the Engine.Searcher to extend IndexSearcher in order to simplify the usage in the consumer.	2019-07-04 22:49:43 +02:00
Christoph Büscher	aeb3c1fd1b	Prevent types deprecation warning for indices.exists requests (#43963 ) Currently we log a deprecation warning to the types removal in RestGetIndicesAction even if the REST method is HEAD, which is used by the indices.exists API. Since the body is empty in this case we should not need to show the deprecation warning. Closes #43905	2019-07-04 17:20:43 +02:00
Tanguy Leroux	b037aeaa6e	Fix IndexShardIT.testIndexCanChangeCustomDataPath() (#43978 ) The test IndexShardIT.testIndexCanChangeCustomDataPath() fails on 7.x and 7.3 because the translog cannot be recovered. While I can't reproduce the issue, I think it has been introduced in #43752 which changed ReadOnlyEngine so that it opens the translog in its constructor in order to load the translog stats. This opening writes a new checkpoint file, but because 7.x/7.3 does not wait for shards to be started after being closed, the test immediately starts to copy shard files to a new directory and possibly does not copy all the required translog files. By waiting for the shards to be started after being closed, we ensure that the shards (and engines) have been correctly initialized and that the translog checkpoint file is not currently being written. closes #43964	2019-07-04 17:06:37 +02:00
Alan Woodward	4b99255fed	Add name() method to TokenizerFactory (#43909 ) This brings TokenizerFactory into line with CharFilterFactory and TokenFilterFactory, and removes the need to pass around tokenizer names when building custom analyzers. As this means that TokenizerFactory is no longer a functional interface, the commit also adds a factory method to TokenizerFactory to make construction simpler.	2019-07-04 11:28:55 +01:00
Jim Ferenczi	2cc0a56fe6	Fix wrong logic in `match_phrase` query with multi-word synonyms (#43941 ) Disjunction over two individual terms in a phrase query with multi-word synonyms wrongly applies a prefix query to each of these terms. This change fixes this bug by inverting the logic to use prefixes on `phrase_prefix` queries only. Closes #43308	2019-07-04 09:39:39 +02:00
Henning Andersen	cacc3f7ff8	Async IO Processor release before notify (#43682 ) This commit changes async IO processor to release the promiseSemaphore before notifying consumers. This ensures that a bad consumer that sometimes does blocking (or otherwise slow) operations does not halt the processor. This should slightly increase the concurrency for shard fsync, but primarily improves safety so that one bad piece of code has less effect on overall system performance.	2019-07-04 06:33:38 +02:00
Igor Motov	c593085104	Geo: Refactors libs/geo parser to provide serialization logic as well (#43717 ) Enables libs/geo parser to return a geometry format object that can perform both serialization and deserialization functions. This can be useful for ingest nodes that are trying to modify an existing geometry in the source. Relates to #43554	2019-07-03 19:31:44 -04:00
Adrien Grand	680edbe3f1	Bump current version to 7.4. (#43927 )	2019-07-03 20:32:04 +02:00
Armin Braun	be20fb80e4	Recursive Delete on BlobContainer (#43281 ) (#43920 ) This is a prerequisite of #42189: * Add directory delete method to blob container specific to each implementation: * Some notes on the implementations: * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting. * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block. * HDFS and FS are trivial since we have directory delete methods available for them * Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)	2019-07-03 17:14:57 +02:00
Alan Woodward	49d69bf987	Actually close IndexAnalyzers contents (#43914 ) IndexAnalyzers has a close() method that should iterate through all its wrapped analyzers and close each one in turn. However, instead of delegating to the analyzers' close() methods, it instead wraps them in a Closeable interface, which just returns a list of the analyzers. In addition, whitespace normalizers are ignored entirely.	2019-07-03 16:06:58 +01:00
David Turner	9cecc31cdc	Shortcut simple patterns ending in `*` (#43904 ) When profiling a call to `AllocationService#reroute()` in a large cluster containing allocation filters of the form `node-name-*` I observed a nontrivial amount of time spent in `Regex#simpleMatch` due to these allocation filters. Patterns ending in a wildcard are not uncommon, and this change treats them as a special case in `Regex#simpleMatch` in order to shave a bit of time off this calculation. It also uses `String#regionMatches()` to avoid an allocation in the case that the pattern's only wildcard is at the start. Microbenchmark results before this change: Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch": 1113.839 ±(99.9%) 6.338 ns/op [Average] (min, avg, max) = (1102.388, 1113.839, 1135.783), stdev = 9.486 CI (99.9%): [1107.502, 1120.177] (assumes normal distribution) Microbenchmark results with this change applied: Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch": 433.190 ±(99.9%) 0.644 ns/op [Average] (min, avg, max) = (431.518, 433.190, 435.456), stdev = 0.964 CI (99.9%): [432.546, 433.833] (assumes normal distribution) The microbenchmark in question was: @Fork(3) @Warmup(iterations = 10) @Measurement(iterations = 10) @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.NANOSECONDS) @State(Scope.Benchmark) @SuppressWarnings("unused") //invoked by benchmarking framework public class RegexStartsWithBenchmark { private static final String testString = "abcdefghijklmnopqrstuvwxyz"; private static final String[] patterns; static { patterns = new String[testString.length() + 1]; for (int i = 0; i <= testString.length(); i++) { patterns[i] = testString.substring(0, i) + "*"; } } @Benchmark public void performSimpleMatch() { for (int i = 0; i < patterns.length; i++) { Regex.simpleMatch(patterns[i], testString); } } }	2019-07-03 14:15:27 +01:00
paulward24	cff027499a	Ensure to access RecoveryState#fileDetails under lock Closes #43840	2019-07-03 07:39:58 -04:00
Armin Braun	7059224668	Optimize Snapshot Finalization (#42723 ) (#43908 ) * Optimize Snapshot Finalization * Delete index-N blobs and segment blobs in one single bulk delete instead of in separate ones to save RPC calls on implementations that have bulk deletes implemented * Don't fail snapshot because deleting old index-N failed, this results in needlessly logging finalization failures and makes analysis of failures harder going forward as well as incorrect index.latest blobs	2019-07-03 13:26:35 +02:00
Armin Braun	455b12a4fb	Add Ability to List Child Containers to BlobContainer (#42653 ) (#43903 ) * Add Ability to List Child Containers to BlobContainer (#42653) * Add Ability to List Child Containers to BlobContainer * This is a prerequisite of #42189	2019-07-03 11:30:49 +02:00
Henning Andersen	cd2972239c	AsyncIOProcessor preserve thread context (#43729 ) AsyncIOProcessor now preserves thread context, ensuring that deprecation warnings are not duplicated to other concurrent operations on the same shard.	2019-07-03 10:22:20 +02:00
Jim Ferenczi	05c0cff1b6	Fix index_prefix sub field name on nested text fields (#43862 ) This change fixes the name of the index_prefix sub field when the `index_prefix` option is set on a text field that is nested under an object or a multi-field. We don't use the full path of the parent field to set the index_prefix field name so the field is registered under the wrong name. This doesn't break queries since we always retrieve the prefix field through its parent field but this breaks other APIs like _field_caps which tries to find the parent of the `index_prefix` field in the mapping but fails. Closes #43741	2019-07-03 09:50:52 +02:00
Armin Braun	826f38cd70	Enable Parallel Deletes in Azure Repository (#42783 ) (#43886 ) * Parallel deletes via private thread pool	2019-07-03 09:28:39 +02:00
Tanguy Leroux	365dfe88ca	Refresh translog stats after translog trimming in NoOpEngine (#43825 ) This commit changes NoOpEngine so that it refreshes its translog stats once translog is trimmed. Relates #43156	2019-07-03 08:49:14 +02:00
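
Commit 9cecc31cdc above describes special-casing simple patterns whose only wildcard is at the end (and using `String#regionMatches()` when the only wildcard is at the start). The idea can be sketched roughly as follows. This is an illustrative sketch only, not the actual `Regex#simpleMatch` code from the repository: the class name `SimplePatternSketch` and the general-case fallback are assumptions made for the example.

```java
// Hypothetical illustration; NOT the repository's Regex#simpleMatch implementation.
final class SimplePatternSketch {
    private SimplePatternSketch() {}

    /** Returns true if str matches pattern, where '*' stands for any run of characters. */
    static boolean simpleMatch(String pattern, String str) {
        if (pattern == null || str == null) {
            return false;
        }
        int firstStar = pattern.indexOf('*');
        if (firstStar == -1) {
            // No wildcard at all: plain equality.
            return pattern.equals(str);
        }
        if (firstStar == pattern.length() - 1) {
            // Only wildcard is the last character: compare the prefix in place.
            return str.regionMatches(0, pattern, 0, firstStar);
        }
        if (firstStar == 0 && pattern.indexOf('*', 1) == -1) {
            // Only wildcard is the first character: compare the suffix without
            // allocating a substring. A negative offset makes regionMatches return false.
            int suffixLength = pattern.length() - 1;
            return str.regionMatches(str.length() - suffixLength, pattern, 1, suffixLength);
        }
        return generalMatch(pattern, str);
    }

    /** General case (assumed fallback): greedy matching with backtracking to the last '*'. */
    private static boolean generalMatch(String p, String s) {
        int pi = 0;      // position in pattern
        int si = 0;      // position in string
        int star = -1;   // index of the most recent '*' in the pattern
        int mark = 0;    // string position where that '*' started matching
        while (si < s.length()) {
            if (pi < p.length() && p.charAt(pi) == '*') {
                star = pi++;
                mark = si;
            } else if (pi < p.length() && p.charAt(pi) == s.charAt(si)) {
                pi++;
                si++;
            } else if (star != -1) {
                // Mismatch: let the last '*' absorb one more character and retry.
                pi = star + 1;
                si = ++mark;
            } else {
                return false;
            }
        }
        // Any remaining pattern characters must all be wildcards.
        while (pi < p.length() && p.charAt(pi) == '*') {
            pi++;
        }
        return pi == p.length();
    }
}
```

With this sketch, a call such as `SimplePatternSketch.simpleMatch("node-name-*", "node-name-1")` takes the prefix fast path and never enters the general loop, which is the kind of saving the commit message reports for allocation filters.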


3259 Commits