This primarily affected tests, but may have also caused odd errors/delays when restart/shutting down solr nodes.
(cherry picked from commit 980fd7d5c56349570b2202dedcf7b454216a61b8)
There are 3 tightly related bug fixes in these changes:
1) ConcurrentModificationExceptions were being thrown by some SimClusterStateProvider methods when
creating collections/replicas due to the use of ArrayLists nodeReplicaMap. These ArrayLists were changed
to use synchronizedList wrappers.
2) The Exceptions from #1 were being swallowed/hidden by code using SimCloudManager.submit() w/o checking
the result of the resulting Future object. (As a result, tests waiting for a particular ClusterShape
would timeout regardless of how long they waited.) To protect against "silent" failures like this,
this SimCloudManager.submit() has been updated to wrap all input Callables such that any uncaught errors
will be logged and "counted." SimSolrCloudTestCase will ensure a suite level failure if any such failures
are counted.
3) The changes in #2 exposed additional concurrency problems with the Callables involved in leader election:
These would frequently throw IllegalStateExceptions due to assumptions about the state/existence of
replicas when the Callables were created vs when they were later run -- notably a Callable may have been
created that held a reference to a Slice, but by the time that Callable was run the collection (or a
node, etc...) refered to by that Slice may have been deleted. While fixing this, the leader election
logic was also cleaned up such that adding a replica only triggers leader election for that shard, not
every shard in the collection.
While auditing this code, cleanup was also done to ensure all usage of SimClusterStateProvider.lock was
also cleaned up to remove all risky points where an exception may have been possible after aquiring the
lock but before the try/finally that ensured it would be unlocked.
(cherry picked from commit 76babf876a49f82959cc36a1d7ef922a9c2dddff)
default inside of the 'expr' parameter, add InjectionDefense class
for safer handling of untrusted data in streaming expressions and add
-DStreamingExpressionMacros system property to revert to legacy behavior
(cherry picked from commit 9edc557f4526ffbbf35daea06972eb2c595e692b)
Solr 7.5 enabled autoscaling based replica placement by default but in the absence of default cluster policies, autoscaling can place more than 1 replica of the same shard on the same node. Also, the maxShardsPerNode and createNodeSet was not respected. Due to these reasons, this issue reverts the default replica placement policy to the 'legacy' assignment policy that was the default until Solr 7.4.
(cherry picked from commit 7e2d40197cb096fe0519652c2ebbbf38a70d0d65)
Prior to this commit, new ZK nodes being simulated by the sim framework
were started with a version of -1. This causes problems, since -1 is
also coincidentally the flag value used to ignore optimistic concurrency
locking and force overwrite values.