There are 3 tightly related bug fixes in these changes:
1) ConcurrentModificationExceptions were being thrown by some SimClusterStateProvider methods when
creating collections/replicas due to the use of ArrayLists in nodeReplicaMap. These ArrayLists were changed
to use synchronizedList wrappers.
2) The Exceptions from #1 were being swallowed/hidden by code using SimCloudManager.submit() w/o checking
the result of the returned Future object. (As a result, tests waiting for a particular ClusterShape
would time out regardless of how long they waited.) To protect against "silent" failures like this,
SimCloudManager.submit() has been updated to wrap all input Callables such that any uncaught errors
will be logged and "counted." SimSolrCloudTestCase will ensure a suite level failure if any such failures
are counted.
3) The changes in #2 exposed additional concurrency problems with the Callables involved in leader election:
These would frequently throw IllegalStateExceptions due to assumptions about the state/existence of
replicas when the Callables were created vs when they were later run -- notably a Callable may have been
created that held a reference to a Slice, but by the time that Callable was run the collection (or a
node, etc...) referred to by that Slice may have been deleted. While fixing this, the leader election
logic was also cleaned up such that adding a replica only triggers leader election for that shard, not
every shard in the collection.
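As an illustration of the wrapping described in #2, here is a minimal sketch of a submit() that logs and counts failures even when nobody inspects the Future. The class and method names (TrackedSubmitSketch, failureCount, shutdownAndWait) are hypothetical and are not taken from SimCloudManager itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the actual SimCloudManager implementation.
final class TrackedSubmitSketch {
  private final ExecutorService pool = Executors.newFixedThreadPool(2);
  private final AtomicLong failures = new AtomicLong();

  // Wraps the task so a failure is logged and counted even if nobody calls Future.get().
  <T> Future<T> submit(Callable<T> task) {
    Callable<T> wrapped = () -> {
      try {
        return task.call();
      } catch (Throwable t) {
        failures.incrementAndGet();
        System.err.println("Uncaught failure in submitted task: " + t);
        throw t; // still propagate to the Future for callers that do check it
      }
    };
    return pool.submit(wrapped);
  }

  long failureCount() {
    return failures.get();
  }

  // Stops accepting tasks and waits briefly for the submitted ones to finish.
  void shutdownAndWait() {
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    TrackedSubmitSketch sketch = new TrackedSubmitSketch();
    // The returned Futures are deliberately ignored, as in the bug described above.
    sketch.submit(() -> { throw new IllegalStateException("replica no longer exists"); });
    sketch.submit(() -> "ok");
    sketch.shutdownAndWait();
    System.out.println("failures counted: " + sketch.failureCount());
  }
}
```

A test base class could then assert that the counter is zero at suite teardown, which is the role the message assigns to SimSolrCloudTestCase.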
While auditing this code, all usage of SimClusterStateProvider.lock was also cleaned up to remove
all risky points where an exception may have been possible after acquiring the
lock but before the try/finally that ensured it would be unlocked.
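The lock discipline described above can be sketched as follows, assuming the lock is a java.util.concurrent ReentrantLock. The class, field, and method names are hypothetical. The point is that anything that can throw runs either before lock() or inside the try block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the pattern; not code from SimClusterStateProvider.
final class LockDisciplineSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final List<String> replicas = new ArrayList<>();

  void addReplica(String name) {
    // Validation that may throw happens BEFORE the lock is taken.
    if (name == null || name.isEmpty()) {
      throw new IllegalArgumentException("replica name is required");
    }
    lock.lock();
    try {
      // Nothing sits between lock() and try, so unlock() is always reached.
      replicas.add(name);
    } finally {
      lock.unlock();
    }
  }

  int replicaCount() {
    lock.lock();
    try {
      return replicas.size();
    } finally {
      lock.unlock();
    }
  }

  boolean isLocked() {
    return lock.isLocked();
  }

  public static void main(String[] args) {
    LockDisciplineSketch state = new LockDisciplineSketch();
    try {
      state.addReplica("");
    } catch (IllegalArgumentException expected) {
      System.out.println("rejected, lock held: " + state.isLocked());
    }
    state.addReplica("core_node1");
    System.out.println("replicas: " + state.replicaCount());
  }
}
```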
default inside of the 'expr' parameter, add InjectionDefense class
for safer handling of untrusted data in streaming expressions and add
-DStreamingExpressionMacros system property to revert to legacy behavior
Solr 7.5 enabled autoscaling based replica placement by default but in the absence of default cluster policies, autoscaling can place more than 1 replica of the same shard on the same node. Also, the maxShardsPerNode and createNodeSet parameters were not respected. Due to these reasons, this issue reverts the default replica placement policy to the 'legacy' assignment policy that was the default until Solr 7.4.
Prior to this commit, new ZK nodes being simulated by the sim framework
were started with a version of -1. This causes problems, since -1 is
also coincidentally the flag value used to ignore optimistic concurrency
locking and force overwrite values.
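To see why a starting version of -1 is a problem, consider this minimal sketch of a versioned node whose setData treats -1 as "skip the version check". The class is hypothetical, and the choice of 0 as the alternative starting version is an assumption made for illustration (real ZooKeeper nodes start at version 0).

```java
// Hypothetical sketch of optimistic concurrency with -1 as the "ignore version" flag.
final class VersionedNodeSketch {
  static final int ANY_VERSION = -1;

  private int version;
  private String data = "";

  VersionedNodeSketch(int initialVersion) {
    this.version = initialVersion;
  }

  synchronized int version() {
    return version;
  }

  // Succeeds if the caller passes ANY_VERSION or the current version; bumps the version on success.
  synchronized boolean setData(String newData, int expectedVersion) {
    if (expectedVersion != ANY_VERSION && expectedVersion != version) {
      return false;
    }
    data = newData;
    version++;
    return true;
  }

  // Replays a lost-update race: A reads the version, B writes, then A writes with its stale version.
  static boolean staleWriteAccepted(int initialVersion) {
    VersionedNodeSketch node = new VersionedNodeSketch(initialVersion);
    int seenByA = node.version();
    node.setData("from B", seenByA);
    return node.setData("from A", seenByA);
  }

  public static void main(String[] args) {
    System.out.println("start at -1, stale write accepted: " + staleWriteAccepted(-1));
    System.out.println("start at 0, stale write accepted: " + staleWriteAccepted(0));
  }
}
```

When the node starts at -1, the version a client reads back is indistinguishable from the "force overwrite" flag, so the stale write silently wins.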
SOLR-12804: Remove static modifier from Overseer queue access.
SOLR-12896: Introduce more checks for shutdown and closed to improve clean close and shutdown. (Partial)
SOLR-12897: Introduce AlreadyClosedException to clean up silly close / shutdown logging. (Partial)
SOLR-12898: Replace cluster state polling with ZkStateReader#waitFor. (Partial)
SOLR-12923: The new AutoScaling tests are way too flaky and need special attention. (Partial)
SOLR-12932: ant test (without badapples=false) should pass easily for developers. (Partial)
SOLR-12933: Fix SolrCloud distributed commit.
Recent JIRAs (SOLR-12947, SOLR-12965) have added support that makes it
easier to compose JSON query/faceting requests using SolrJ. But neither
made parsing the responses to these queries any easier.
This commit introduces NestableJsonFacet (along with several companion
types) which are Java representations of the JSON faceting response.
They can be accessed via the new QueryResponse method:
`getJsonFacetingResponse()`.
JsonQueryRequest had `setQuery` methods that took in a query either as a
String or as a Map. But there was no such overload for MapWriter, a SolrJ
interface used to transmit Maps via "push writing" over the wire. This
commit adds an overload taking this type, so that users can specify
their queries this way as well.
This commit also changes how JsonQueryRequest writes out the request, to
ensure it uses "push writing" in non-MapWriter cases as well.
The JSON request API is great, but it's hard to use from SolrJ. This
commit adds 'JsonQueryRequest', which makes it much easier to write
JSON API requests in SolrJ applications.
This test too makes assumptions about how replicas are placed. In the legacy assignment strategy, the replicas of a given collection are spread equally across all nodes but with the new policy based strategy, all cores across collections are spread out. Therefore the assumptions in this test were wrong. I've changed this test to use the legacy assignment policy because testing the autoAddReplicas feature doesn't have to depend on new replica assignment strategies. This change also fixes a bug in Assign which used "collection" key instead of "cluster" to figure out which strategy to use.
The testNonRetryableRequests test makes an assumption that a collection's replicas are equally distributed among all nodes but with the policy engine it is not true. Instead the policy engine spreads out the cores belonging to all collections equally among all nodes. This is fixed by only creating the collection needed by tests in this class just-in-time.
Previously, the maxShardsPerNode parameter was not allowed on collections when autoscaling policy was configured. Also if an autoscaling policy was configured then the default was to set an unlimited maxShardsPerNode automatically. Now the maxShardsPerNode parameter is always allowed during collection creation and maxShardsPerNode should be set correctly (if required) regardless of whether autoscaling policies are in effect or not. The default value of maxShardsPerNode continues to be 1 as before. It can be set to -1 during collection creation to fall back to the old behavior of unlimited maxShardsPerNode when using autoscaling policy. This patch also fixes PolicyHelper to find the free disk space requirements of a new replica from the leader only if said leader node is alive.
ConcurrentUpdateSolrClient can batch together many documents when making
an indexing request to Solr. When adding an update request to the
current batch being made, it checks that the query-parameters of the
docs being added match those already in the batch. But prior to this
commit it never checked that the collections/cores were the same.
This could result in documents being sent to the wrong collection if the
same client is used to index documents to two different
cores/collections simultaneously.
This commit addresses this problem, ensuring that documents aren't added
to a batch directed at a different core/collection.
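The check described above can be sketched as a small predicate. The Update type and its fields are hypothetical stand-ins and do not mirror ConcurrentUpdateSolrClient's internals. The second comparison is the one the message says was missing.

```java
import java.util.Objects;

// Hypothetical sketch; the real client works with UpdateRequest objects and request params.
final class BatchCompatibilitySketch {

  static final class Update {
    final String collection;
    final String params;
    final String docId;

    Update(String collection, String params, String docId) {
      this.collection = collection;
      this.params = params;
      this.docId = docId;
    }
  }

  // An update may join the current batch only if it targets the same collection AND has the same params.
  static boolean canJoinBatch(Update firstInBatch, Update candidate) {
    return Objects.equals(firstInBatch.params, candidate.params)
        && Objects.equals(firstInBatch.collection, candidate.collection);
  }

  public static void main(String[] args) {
    Update first = new Update("collection1", "commitWithin=1000", "doc1");
    Update sameTarget = new Update("collection1", "commitWithin=1000", "doc2");
    Update otherTarget = new Update("collection2", "commitWithin=1000", "doc3");
    System.out.println("same collection joins batch: " + canJoinBatch(first, sameTarget));
    System.out.println("other collection joins batch: " + canJoinBatch(first, otherTarget));
  }
}
```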
The cluster wide defaults structure has changed from {collectionDefaults: {nrtReplicas : 2}} to {defaults : {collection : {nrtReplicas : 2}}}. The old format continues to be supported and can be read from ZK as well as written using the V2 set-obj-property syntax but it is deprecated and will be removed in Solr 9. We recommend that users change their API calls to use the new format going forward.
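One way to picture the backward compatibility described above is a lookup that prefers the new nested format and falls back to the deprecated one. This is only a sketch over plain Maps; the class and method names are hypothetical and say nothing about how Solr actually reads these properties.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of reading a collection default from either format.
final class ClusterDefaultsSketch {

  // Prefers {defaults: {collection: {key: ...}}}, then falls back to {collectionDefaults: {key: ...}}.
  static Object collectionDefault(Map<String, ?> clusterProps, String key) {
    Object value = lookup(clusterProps, "defaults", "collection", key);
    return value != null ? value : lookup(clusterProps, "collectionDefaults", key);
  }

  // Walks nested maps along the given path; returns null if any step is missing.
  private static Object lookup(Object root, String... path) {
    Object current = root;
    for (String step : path) {
      if (!(current instanceof Map)) {
        return null;
      }
      current = ((Map<?, ?>) current).get(step);
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, Object> oldFormat = Collections.singletonMap(
        "collectionDefaults", Collections.singletonMap("nrtReplicas", 2));
    Map<String, Object> newFormat = Collections.singletonMap(
        "defaults", Collections.singletonMap(
            "collection", Collections.singletonMap("nrtReplicas", 2)));
    System.out.println("old format: " + collectionDefault(oldFormat, "nrtReplicas"));
    System.out.println("new format: " + collectionDefault(newFormat, "nrtReplicas"));
  }
}
```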
This commit deprecates the min_rf parameter. Solr now always includes the achieved replication
factor in the update requests (as if min_rf was always specified). It also reverts the changes
introduced in SOLR-8034: replicas that don't ack an update will have to recover to prevent
inconsistent shards.
Now, assignment is done with the help of a builder class instead of calling a method with a large number of arguments. The number of special cases that had to be handled has been cut down as well.
The API now supports 'nrtReplicas', 'tlogReplicas', 'pullReplicas' parameters as well as the 'createNodeSet' parameter. As part of this change, the CREATESHARD API now delegates placing replicas entirely to the ADDREPLICA command and uses the new parameters to add all the replicas in one API call.
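For readers unfamiliar with the pattern, a builder of this kind might look roughly like the sketch below. All names here (AssignRequestSketch, forShards, nrtReplicas, onNodes) are hypothetical and chosen for illustration; they are not claimed to match the builder introduced by this change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a builder replacing a many-argument assignment method.
final class AssignRequestSketch {
  final String collection;
  final List<String> shards;
  final int nrtReplicas;
  final int tlogReplicas;
  final int pullReplicas;
  final List<String> nodes;

  private AssignRequestSketch(Builder b) {
    this.collection = b.collection;
    this.shards = Collections.unmodifiableList(new ArrayList<>(b.shards));
    this.nrtReplicas = b.nrtReplicas;
    this.tlogReplicas = b.tlogReplicas;
    this.pullReplicas = b.pullReplicas;
    this.nodes = Collections.unmodifiableList(new ArrayList<>(b.nodes));
  }

  static Builder forCollection(String collection) {
    return new Builder(collection);
  }

  static final class Builder {
    private final String collection;
    private List<String> shards = new ArrayList<>();
    private int nrtReplicas;
    private int tlogReplicas;
    private int pullReplicas;
    private List<String> nodes = new ArrayList<>();

    private Builder(String collection) {
      this.collection = collection;
    }

    Builder forShards(String... names) { this.shards = Arrays.asList(names); return this; }
    Builder nrtReplicas(int n) { this.nrtReplicas = n; return this; }
    Builder tlogReplicas(int n) { this.tlogReplicas = n; return this; }
    Builder pullReplicas(int n) { this.pullReplicas = n; return this; }
    Builder onNodes(String... names) { this.nodes = Arrays.asList(names); return this; }

    AssignRequestSketch build() {
      if (shards.isEmpty()) {
        throw new IllegalStateException("at least one shard is required");
      }
      return new AssignRequestSketch(this);
    }
  }

  public static void main(String[] args) {
    AssignRequestSketch request = AssignRequestSketch.forCollection("products")
        .forShards("shard1")
        .nrtReplicas(2)
        .onNodes("node1:8983_solr", "node2:8983_solr")
        .build();
    System.out.println(request.collection + " " + request.shards + " nrt=" + request.nrtReplicas);
  }
}
```

Unset replica types simply default to zero, which removes the need for callers to pass placeholder arguments.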
Simplified test utility TrackingUpdateProcessorFactory.
Reverted some attempts the TRA used to make in avoiding overseer communication (too complicated).
Closes #433
Cluster properties restriction of known keys only is relaxed, and now unknown properties starting with "ext."
will be allowed. This allows custom plugins to set their own cluster properties.
This commit adds support for preferredOperation configuration parameter which defaults to movereplica. Changes ComputePlanAction to add all (collection,shard) pairs as hints to AddReplicaSuggester when addreplica is selected as the preferred operation.
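The relaxed rule amounts to a check along these lines. This is a sketch only: the class name is hypothetical and the set of known keys below is illustrative, not the authoritative list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of validating a cluster property key.
final class ClusterPropKeySketch {
  static final String EXT_PREFIX = "ext.";

  // Illustrative subset of known keys; the real list lives in Solr and may differ.
  private static final Set<String> KNOWN_KEYS = Collections.unmodifiableSet(
      new HashSet<>(Arrays.asList("urlScheme", "autoAddReplicas", "location", "maxCoresPerNode")));

  // A key is accepted if it is a known key or carries the "ext." prefix reserved for plugins.
  static boolean isAllowed(String key) {
    return key != null && (KNOWN_KEYS.contains(key) || key.startsWith(EXT_PREFIX));
  }

  public static void main(String[] args) {
    System.out.println("urlScheme: " + isAllowed("urlScheme"));
    System.out.println("ext.myplugin.timeout: " + isAllowed("ext.myplugin.timeout"));
    System.out.println("myplugin.timeout: " + isAllowed("myplugin.timeout"));
  }
}
```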
Fixed SolrDocument's confusion of field-attached child documents in addField()
Fixed AtomicUpdateDocumentMerger's confusion of field-attached child documents in isAtomicUpdate()
A collection may be co-located with another collection during collection creation time by specifying a
'withCollection' parameter. It can also be co-located afterwards by using the modify collection API.
The co-location guarantee is enforced regardless of future cluster operations whether they are invoked
manually via the Collection API or automatically by the Autoscaling framework.
Squashed commit of the following:
commit 3827703b38c598f1247c90ab57d3d640ab3a9e21
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 11:54:10 2018 +0530
SOLR-11990: Added change log entry
commit 7977222e07ba47274062cb8d8a69e7956d644000
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 11:52:17 2018 +0530
SOLR-11990: Added change log entry
commit 1857075fdb9d535b6149ad4369fed8b64b0c01f6
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 11:49:51 2018 +0530
SOLR-11990: Added note about co-location guarantees being one way only
commit 8557cbc8a511f21d1fcad99e11ea9d2104d0bef4
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 10:43:37 2018 +0530
SOLR-11990: Remove unused import
commit 864b013fd744edca9b6b84a8a7573fab3c5310d5
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 10:21:59 2018 +0530
SOLR-11990: Fixing compilation issues after merging master
commit dd840a2f7e765ee96c899d4d9ea89b6b67c5ae62
Merge: bb4ffb3 828d281
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Sat Jul 28 10:03:50 2018 +0530
Merge branch 'master' into jira/solr-11990
# Conflicts:
# solr/solr-ref-guide/src/collections-api.adoc
# solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java
# solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Suggestion.java
commit bb4ffb32c4960a2809ac8927e214e1e012204a73
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Fri Jul 27 14:09:44 2018 +0530
SOLR-11990: Ensure that the suggestions are validated by the policy engine otherwise move to the next candidate replica or the next candidate node
commit a97d45b22f9c232e939f979502c761001be9ae24
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Fri Jul 27 13:22:10 2018 +0530
SOLR-11990: Autoscaling suggestions for withCollection violations should prefer moving replicas before adding replicas
commit 7b5a84338dfe7335599a5e96aff2d26cb4eeaac6
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Fri Jul 27 12:22:45 2018 +0530
SOLR-11990: Fix statement about the behavior of the modify collection API when modifying the withCollection parameter
commit 63aec4fe0de7025c16b6ebc47dad1004531ecee1
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Thu Jul 26 07:29:07 2018 +0530
SOLR-11990: Added new page to the reference guide describing how to colocate collections together including guarantees and limitations
commit 6bfcd0786bb30353de9c26a01ec97ce3191b58f8
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 25 21:42:25 2018 +0530
SOLR-11990: Added another test which creates two collections which are colocated with two different collections and ensures that create collection and add replica operations work correctly
commit 4cead778f0044b6fb4012b085abf7b60350f495b
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 25 21:07:47 2018 +0530
SOLR-11990: Stop or start jettys in test setup to ensure that we always have exactly 2 replicas running before a test starts
commit 70dbfd042c2164fcd76d406eeab1518e4d3147fb
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 25 19:19:07 2018 +0530
SOLR-11990: Added description of the new withCollection parameter in the reference guide
commit 9d8260852b9d667d4d8e026432fd7727b7789393
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 25 19:16:46 2018 +0530
SOLR-11990: Reset count down latch during test setup
commit ae508165571b1afde54337859b8d5fdbb1d67312
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 25 15:43:54 2018 +0530
SOLR-11990: Add support for withCollection in simulated create collection API
commit 84f026b8c4cc25edb548430b8f5ad09d2486b3b5
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 17:21:33 2018 +0530
SOLR-11990: Ported the refactoring made in CreateCollectionCmd to the simulated version so that simulation tests are able to create collections correctly
commit defe111c9d31c8e4f0f00b4f2f3c875f5b2fa602
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 16:17:52 2018 +0530
SOLR-11990: Add missing javadoc for return statement
commit 8e47d5bc4545548c5441909c3fcc1a7901b38185
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 16:11:45 2018 +0530
SOLR-11990: Replace usage of forbidden Charsets with StandardCharsets class
commit 2d1b9eb25ea96a3a42c000ae654400ed44c17554
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 16:07:36 2018 +0530
SOLR-11990: Extract ConditionType to an interface VarType along with a WithCollectionVarType implementation
commit 1de2a4f52a59afca28de75bfa5156a3d6567a4f5
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 12:53:26 2018 +0530
SOLR-11990: Pass strict-ness parameter to the ConditionType so that WITH_COLLECTION can choose not to project add replica in strict mode.
This ensures that add replica or move replica suggesters always choose nodes that already have withCollection replicas first unless there are violations in doing so. Only if the first pass fails to find a suitable replica, do we go to the other nodes in the cluster. This also removes the need for the majority of changes in AddReplicaSuggester and so they've been reverted.
commit 0d616ed9e9bad791548c87086cba7760d724350d
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Tue Jul 24 11:36:34 2018 +0530
SOLR-11990: Minor changes to formatting and code comments
commit 1228538f934f35f15797d89c2c66f2deb9cddd8c
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Mon Jul 23 14:26:19 2018 +0530
SOLR-11990: Added a test which simulates a lost node and asserts that move replica suggester moves the replica on the lost node to a node already having the withCollection present
commit 582f1fd98de93ab73c74a1f623749dd031beb381
Author: Noble Paul <noble@apache.org>
Date: Mon Jul 23 18:35:22 2018 +1000
SOLR-11990: NPE removing unnecessary System.out.println
commit 501bc6c1d066321b344bbb8b1de3c2ead52f8c49
Author: Noble Paul <noble@apache.org>
Date: Mon Jul 23 18:31:07 2018 +1000
SOLR-11990: NPE during class init
commit acbf4a69321e16cff11cc7cf0a1f076fd9ac0037
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Mon Jul 23 13:55:30 2018 +0530
SOLR-11990: Added asserts on the nodes that should be selected by the add replica suggester
commit 4824933fd6eb7d1773acbff1a1a0c5e670226e0b
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Fri Jul 20 14:30:52 2018 +0530
SOLR-11990: Added WITH_COLLECTION to global tags. Fixed implementation of addViolatingReplicas and getSuggestions in the clause impl. Added more asserts in testWithCollectionSuggestions.
commit dbadb33211c190026e08d8e3ea587b6f8df8720b
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Fri Jul 20 13:44:36 2018 +0530
SOLR-11990: Added support for comparing violations, generating suggestions and adding violating replicas
commit ada1f17d5c93a4186260473e4822d2bee1da0e16
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 18 19:14:56 2018 +0530
SOLR-11990: Fix mock node state provider in TestPolicy to use the right cluster state. Added nocommits to ensure that we return the right suggestions for this feature.
commit ef2d61812e0d96eb2275b3411906d9de57ab835e
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 18 18:39:51 2018 +0530
SOLR-11990: Add missing node in nodeValues configuration
commit 34841fc01fea4a9f1e6a9f64050e576f2247a72b
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date: Wed Jul 18 16:32:57 2018 +0530
SOLR-11990: Make it possible to co-locate replicas of multiple collections together in a node
LUCENE-8345, GitHub PR #392: Remove instantiation of redundant wrapper classes for primitives; add wrapper class constructors to forbiddenapis.
This closes #392