OpenSearch

History

David Turner 389f7779e7 Report more details of unobtainable ShardLock (#61255)

Today a common reason for a `ShardLockObtainFailedException` is when a shard is removed from a node and then assigned straight back to it again before the node has had a chance to shut the previous shard instance down. For instance, this can happen if a node briefly leaves the cluster holding a primary with no in-sync replicas. The message in this case is typically as follows:

    obtaining shard lock timed out after 5000ms, previous lock details: [shard creation] trying to lock for [shard creation]

This is pretty hard to interpret, and doesn't raise the important question: "why didn't the shard shut down sooner?"

With this change we reword the message a bit, report the age of the shard lock, and adjust the details to report that the lock is held by a closing shard:

    obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [12345ms]

Relates #38807

2020-08-19 06:36:28 +01:00
licenses	upgrade to lucene-8.6.0 release (#59596) (#59599)	2020-07-15 12:40:57 +02:00
src	Report more details of unobtainable ShardLock (#61255)	2020-08-19 06:36:28 +01:00
build.gradle	Replace immediate task creations by using task avoidance api (#60071) (#60504)	2020-07-31 13:09:04 +02:00