SOLR-10821: Ref guide documentation for Autoscaling

Squashed commit of the following:

commit 4a8eb9491a1dc8099805656adec34197d0dab092
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Fri Aug 4 14:35:57 2017 +0530

    SOLR-10821: Added note on using maxShardsPerNode along with a policy

commit 2a9bb140e12a60f93a6314647cf7d4fdc7f4fe60
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Fri Aug 4 14:04:48 2017 +0530

    SOLR-10821: Use availability zone as an example instead of region

commit e5f0fe130ae7a269df7a3741c7ed7bf8b009c446
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Fri Aug 4 08:34:04 2017 +0530

    SOLR-10821: Removed mention of triggers

commit 876276626a90849068d5fd0000893ba3660ac687
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Fri Aug 4 08:32:29 2017 +0530

    SOLR-10821: Added policy specifications and examples

commit 245be9c44af7427fbde292120520f70ed54cadc9
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Fri Aug 4 08:12:09 2017 +0530

    SOLR-10821: Added note on what happens when you change the policy/preferences

commit 202fe3324748fdfb12d5ffbba60bd69c6aa768cb
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Thu Aug 3 20:32:10 2017 +0530

    SOLR-10821: Added specification for policy attributes and operators

commit ccb6c559eb1ee080c5be06f1b471554d5038f699
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Thu Aug 3 13:14:41 2017 +0530

    SOLR-10821: Added documentation on how to completely remove cluster preferences and policies

commit 24e4827f2e482929546a6e0de447046f79e1510d
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Tue Aug 1 10:00:15 2017 +0530

    SOLR-10821: Added documentation for cluster preferences

commit d77c4786909e406ba194ef7144e1abd38c8bce83
Author: Cassandra Targett <ctargett@apache.org>
Date:   Mon Jul 31 13:19:30 2017 -0500

    SOLR-10821: standardize "autoscaling" spelling & headings; other small copy edits

commit 4644e2963d8bb51aada46fc3c9180eec0bfdac12
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Mon Jul 31 17:15:29 2017 +0530

    SOLR-10821: Added docs for the autoscaling write APIs

commit c7c0c86a2e3e15e6aa4e986a865e73e596cf275e
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Mon Jul 31 14:13:56 2017 +0530

    Added docs for the autoscaling read and diagnostics API

commit 1fd011cce3a97eb51597d5ca09c02ca5038c769b
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Mon Jul 31 13:14:35 2017 +0530

    SOLR-10821: Removed host/port and only show the path information in example

commit 9199fa3432f8c5ab4df35dbe776aeb586baa60ec
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Mon Jul 31 13:12:46 2017 +0530

    SOLR-10821: Remove mention of triggers and listeners

commit d7f7639fa1dbe3584f9f1dabb3663a5630b78fcb
Author: Shalin Shekhar Mangar <shalin@apache.org>
Date:   Mon Jul 31 13:06:19 2017 +0530

    SOLR-10821: First cut of nav structure and overview page for autoscaling features in 7.0
2025-03-08 17:49:29 +00:00 · 2017-08-05 08:42:16 +05:30 · 2017-08-05 08:42:16 +05:30 · 2e502c357c
commit 2e502c357c
parent f962effd12
6 changed files with 621 additions and 1 deletions
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@ -627,6 +627,8 @@ Other Changes
  
 * SOLR-10803: Mark all Trie/LegacyNumeric based fields @deprecated in Solr7.  (Steve Rowe)

+* SOLR-10821: Ref guide documentation for Autoscaling (Noble Paul, Cassandra Targett, shalin)
+
 ==================  6.7.0 ==================

 Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release.
--- a/solr/solr-ref-guide/src/solrcloud-autoscaling-api.adoc
+++ b/solr/solr-ref-guide/src/solrcloud-autoscaling-api.adoc
@ -0,0 +1,319 @@
+= SolrCloud Autoscaling API
+:page-shortname: solrcloud-autoscaling-api
+:page-permalink: solrcloud-autoscaling-api.html
+:page-toclevels: 2
+:page-tocclass: right
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The Autoscaling API can be used to manage autoscaling policies and preferences, and to get diagnostics on the state of the cluster.
+
+== Read API
+
+The autoscaling Read API is available at `/admin/autoscaling` or `/v2/cluster/autoscaling`. It returns information about the configured cluster preferences, cluster policy and collection-specific policies.
+
+This API does not take any parameters.
+
+=== Read API Response
+
+The output will contain the cluster preferences, the cluster policy, and any collection-specific policies.
+
+=== Examples using Read API
+
+*Output*
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 2
+    },
+    "cluster-policy": [
+        {
+            "replica": "<2",
+            "shard": "#EACH",
+            "node": "#ANY"
+        }
+    ],
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
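A minimal Python sketch of calling the Read API, offered as an illustration rather than as part of the commit. The base URL `http://localhost:8983/solr` and the helper names are assumptions; only the `/admin/autoscaling` path and the `cluster-policy` key come from the text above.

```python
import json
import urllib.request

# Assumption: a SolrCloud node reachable at this base URL; adjust as needed.
SOLR_BASE_URL = "http://localhost:8983/solr"


def autoscaling_url(base_url=SOLR_BASE_URL):
    """Build the URL of the autoscaling Read API endpoint."""
    return base_url.rstrip("/") + "/admin/autoscaling"


def read_autoscaling_config(base_url=SOLR_BASE_URL):
    """Issue a GET request and return the parsed JSON response as a dict."""
    with urllib.request.urlopen(autoscaling_url(base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


def cluster_policy(config):
    """Return the list of cluster policy conditions, or [] if none are set."""
    return config.get("cluster-policy", [])
```

Calling `cluster_policy(read_autoscaling_config())` against a running cluster would return the list of conditions shown under `cluster-policy` in the example output.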
+
+== Diagnostics API
+
+The diagnostics API shows the violations, if any, of all conditions in the cluster or collection-specific policy. It is available at the `/admin/autoscaling/diagnostics` path.
+
+This API does not take any parameters.
+
+=== Diagnostics API Response
+
+The output will contain `sortedNodes` which is a list of nodes in the cluster sorted according to overall load in descending order (as determined by the preferences) and `violations` which is a list of nodes along with the conditions that they violate.
+
+=== Examples Using Diagnostics API
+
+Here is an example with no violations. In the `sortedNodes` section, we can see that the first node is the most loaded (according to the number of cores):
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 65
+    },
+    "diagnostics": {
+        "sortedNodes": [
+            {
+                "node": "127.0.0.1:8983_solr",
+                "cores": 3
+            },
+            {
+                "node": "127.0.0.1:7574_solr",
+                "cores": 2
+            }
+        ],
+        "violations": []
+    },
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+Suppose we added a condition to the cluster policy as follows:
+
+[source,json]
+----
+{"replica": "<2", "shard": "#EACH", "node": "#ANY"}
+----
+
+Because the first node in the first example already hosts more than one replica of a shard, the diagnostics API will now return:
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 45
+    },
+    "diagnostics": {
+        "sortedNodes": [
+            {
+                "node": "127.0.0.1:8983_solr",
+                "cores": 3
+            },
+            {
+                "node": "127.0.0.1:7574_solr",
+                "cores": 2
+            }
+        ],
+        "violations": [
+            {
+                "collection": "gettingstarted",
+                "shard": "shard1",
+                "node": "127.0.0.1:8983_solr",
+                "tagKey": "127.0.0.1:8983_solr",
+                "violation": {
+                    "replica": "2",
+                    "delta": 0
+                },
+                "clause": {
+                    "replica": "<2",
+                    "shard": "#EACH",
+                    "node": "#ANY",
+                    "collection": "gettingstarted"
+                }
+            }
+        ]
+    },
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+In the above example the node with port 8983 has two replicas for `shard1` in violation of our policy.
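The violations list above can be reduced to a short summary with a few lines of Python. This is only a sketch: the function name is hypothetical, and the field names (`diagnostics`, `violations`, `node`, `collection`, `shard`, `violation.replica`) are taken from the example response, which is itself marked experimental.

```python
def summarize_violations(diagnostics_response):
    """Return (node, collection, shard, replica_count) for each violation."""
    diagnostics = diagnostics_response.get("diagnostics", {})
    summary = []
    for entry in diagnostics.get("violations", []):
        summary.append((
            entry.get("node"),
            entry.get("collection"),
            entry.get("shard"),
            entry.get("violation", {}).get("replica"),
        ))
    return summary


# Trimmed copy of the example response shown above.
sample_response = {
    "diagnostics": {
        "violations": [
            {
                "collection": "gettingstarted",
                "shard": "shard1",
                "node": "127.0.0.1:8983_solr",
                "violation": {"replica": "2", "delta": 0},
            }
        ]
    }
}

for node, collection, shard, replicas in summarize_violations(sample_response):
    print(node, collection, shard, replicas)
```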
+
+== Write API
+
+The Write API is available at the same `/admin/autoscaling` and `/v2/cluster/autoscaling` endpoints as the read API but can only be used with the *POST* HTTP verb.
+
+The payload of the POST request is a JSON message with commands to set and remove components. Multiple commands can be specified together in the payload. The commands are executed in the order specified and the changes are atomic, i.e., either all succeed or none.
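As a sketch of how several commands can be sent in one payload, the following Python builds (but does not send) a POST request. The base URL and the helper name are assumptions; the command names and the endpoint path are the ones documented here. Python dictionaries preserve insertion order, so the commands are serialized in the order they are written.

```python
import json
import urllib.request

# Assumption: a SolrCloud node reachable at this base URL; adjust as needed.
SOLR_BASE_URL = "http://localhost:8983/solr"


def build_write_request(commands, base_url=SOLR_BASE_URL):
    """Build a POST request carrying the given autoscaling commands as JSON."""
    body = json.dumps(commands).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/admin/autoscaling",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Two commands in one payload, listed in the order they should be executed.
commands = {
    "set-cluster-preferences": [{"minimize": "cores"}],
    "set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}],
}
request = build_write_request(commands)
# To actually send it: urllib.request.urlopen(request)
```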
+
+=== set-cluster-preferences: Create and Modify Cluster Preferences
+
+The cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in order.
+
+Each preference is a JSON map having the following syntax:
+
+`{'<sort_order>': '<sort_param>', 'precision' : '<precision_val>'}`
+
+You can see the __TODO__ section to know more about the allowed values for the `sort_order`, `sort_param` and `precision` parameters.
+
+Changing the cluster preferences after the cluster is already built doesn't automatically reconfigure the cluster. However, all future cluster management operations will use the changed preferences.
+
+*Input*
+
+[source,json]
+----
+{
+    "set-cluster-preferences" : [
+      {"minimize": "cores"}
+	]
+}
+----
+
+*Output*
+
+The output has a key named `result` which will return either `success` or `failure` depending on whether the command succeeded or failed.
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 138
+    },
+    "result": "success",
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+==== Example Setting Cluster Preferences
+
+In this example we add cluster preferences that sort on three different parameters:
+
+[source,json]
+----
+{
+  "set-cluster-preferences": [
+    {
+      "minimize": "cores",
+      "precision": 2
+    },
+    {
+      "maximize": "freedisk",
+      "precision": 100
+    },
+    {
+      "minimize": "sysLoadAvg",
+      "precision": 10
+    }
+  ]
+}
+----
+
+We can remove all cluster preferences by setting preferences to an empty list.
+[source,json]
+----
+{
+  "set-cluster-preferences": []
+}
+----
+
+=== set-cluster-policy: Create and Modify Cluster Policies
+
+You can see the __TODO__ section to know more about the allowed values for each condition in the policy.
+
+*Input*
+
+[source,json]
+----
+{
+	"set-cluster-policy": [
+		{"replica": "<2", "shard": "#EACH", "node": "#ANY"}
+	]
+}
+----
+
+*Output*
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 47
+    },
+    "result": "success",
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+We can remove all cluster policy conditions by setting policy to an empty list.
+[source,json]
+----
+{
+  "set-cluster-policy": []
+}
+----
+
+Changing the cluster policy after the cluster is already built doesn't automatically reconfigure the cluster. However, all future cluster management operations will use the changed cluster policy.
+
+=== set-policy: Create and Modify Collection-Specific Policy
+
+This command accepts a map of policy names to the list of conditions for each policy. Multiple named policies can be specified together. A named policy that does not already exist is created; if the named policy already exists, it is replaced.
+
+You can see the __TODO__ section to know more about the allowed values for each condition in the policy.
+
+*Input*
+
+[source,json]
+----
+{
+    "set-policy": {
+      "policy1": [
+        {"replica": "1", "shard": "#EACH", "port": "8983"}
+      ]
+  }
+}
+----
+
+*Output*
+
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 246
+    },
+    "result": "success",
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+Changing the policy after the collection is already built doesn't automatically reconfigure the collection. However, all future cluster management operations will use the changed policy.
+
+=== remove-policy: Remove a Collection-Specific Policy
+
+This command accepts a policy name to be removed from Solr. The policy being removed must not be attached to any collection otherwise the command will fail.
+
+*Input*
+[source,json]
+----
+{"remove-policy": "policy1"}
+----
+
+*Output*
+[source,json]
+----
+{
+    "responseHeader": {
+        "status": 0,
+        "QTime": 42
+    },
+    "result": "success",
+    "WARNING": "This response format is experimental.  It is likely to change in the future."
+}
+----
+
+If you attempt to remove a policy that is being used by a collection then this command will fail to delete the policy until the collection itself is deleted.
--- a/solr/solr-ref-guide/src/solrcloud-autoscaling-overview.adoc
+++ b/solr/solr-ref-guide/src/solrcloud-autoscaling-overview.adoc
@ -0,0 +1,59 @@
+= Overview of Autoscaling in SolrCloud
+:page-shortname: solrcloud-autoscaling-overview
+:page-permalink: solrcloud-autoscaling-overview.html
+:page-toclevels: 1
+:page-tocclass: right
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Autoscaling in Solr aims to provide good defaults such that the cluster remains balanced and stable in the face of various events such as a node joining the cluster or leaving the cluster. This is achieved by satisfying a set of rules and sorting preferences that help Solr select the target of cluster management operations.
+
+== Cluster Preferences
+
+Cluster preferences, as the name suggests, apply to all cluster management operations regardless of which collection they affect.
+
+A preference is a set of conditions that helps Solr select nodes that either maximize or minimize given metrics. For example, a preference `{minimize : cores}` will help Solr select nodes such that the number of cores on each node is minimized. Cluster preferences are written in a way that reduces the overall load on the system. You can add more than one preference to break ties.
+
+The default cluster preferences consist of the above example (`{minimize : cores}`) which is to minimize the number of cores on all nodes.
+
+You can learn more about preferences in the __TODO__ section.
+
+== Cluster Policy
+
+A cluster policy is a set of conditions that a node, shard, or collection must satisfy before it can be chosen as the target of a cluster management operation. These conditions are applied across the cluster regardless of the collection being managed. For example, the condition `{"cores":"<10", "node":"#ANY"}` means that any node must have less than ten Solr cores in total regardless of which collection they belong to.
+
+A condition can be based on many metrics, e.g., system load average, heap usage, or free disk space. The full list of supported metrics can be found in the __TODO__ section.
+
+When a node, shard, or collection does not satisfy the policy, we call it a *violation*. Solr ensures that cluster management operations minimize the number of violations. Currently, cluster management operations are invoked manually. In the future, these operations may be invoked automatically in response to cluster events such as a node being added or lost.
+
+== Collection-Specific Policies
+
+Sometimes a collection may need conditions in addition to those specified in the cluster policy. In such cases, we can create named policies that can be used for specific collections. First, use the `set-policy` API to create a new policy, then pass the `policy=<policy_name>` parameter to the CREATE command of the Collection API.
+
+`/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=2&policy=policy1`
+
+The above create collection command will associate a policy named `policy1` with the collection named `coll1`. Only a single policy may be associated with a collection.
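For illustration, the CREATE request path shown above can be assembled in Python as follows. The helper name is hypothetical; the parameter names are the ones used in the example command.

```python
from urllib.parse import urlencode


def create_collection_path(name, num_shards, replication_factor, policy):
    """Build the Collection API CREATE path with a collection-specific policy."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", replication_factor),
        ("policy", policy),  # must name a policy already created via set-policy
    ]
    return "/admin/collections?" + urlencode(params)


path = create_collection_path("coll1", 1, 2, "policy1")
print(path)
```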
+
+Note that the collection-specific policy is applied *in addition* to the cluster policy, i.e., it is not an override but an augmentation. Therefore the collection will follow all conditions laid out in the cluster preferences, cluster policy, and the policy named `policy1`.
+
+You can learn more about collection-specific policies in the __TODO__ section.
+
+== Autoscaling APIs
+
+The autoscaling APIs available at `/admin/autoscaling` can be used to read and modify each of the components discussed above.
+
+You can learn more about these APIs in the __TODO__ section.
--- a/solr/solr-ref-guide/src/solrcloud-autoscaling-policy-preferences.adoc
+++ b/solr/solr-ref-guide/src/solrcloud-autoscaling-policy-preferences.adoc
@ -0,0 +1,209 @@
+= SolrCloud Autoscaling Policy and Preferences
+:page-shortname: solrcloud-autoscaling-policy-preferences
+:page-permalink: solrcloud-autoscaling-policy-preferences.html
+:page-toclevels: 2
+:page-tocclass: right
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The autoscaling policy and preferences are a set of rules and sorting preferences that help Solr select the target of cluster management operations such that the overall load on the cluster is balanced.
+
+== Cluster preferences specification
+
+A preference is a hint to Solr on how to sort nodes based on their utilization. The default cluster preference is to sort by the total number of Solr cores (or replicas) hosted by the node. Therefore, by default, when selecting a node to add a replica, Solr can apply the preferences and choose the node with the least number of cores.
+
+More than one preference can be added to break ties. For example, if two nodes have the same number of cores, we may choose to use free disk space to break the tie so that the node with more free disk space is chosen as the target of the cluster operation.
+
+Each preference is of the following form:
+[source,json]
+----
+{"<sort_order>": "<sort_param>", "precision" : "<precision_val>"}
+----
+
+`sort_order`::
+The value can be either `maximize` or `minimize`. `minimize` treats the nodes with the lowest value as the least loaded. For example, `{"minimize" : "cores"}` treats the node with the fewest cores as the least loaded node, while `{"maximize" : "freedisk"}` treats the node with the most free disk space as the least loaded node. The objective of the system is to make every node the least loaded. So, for example, a `MOVEREPLICA` operation usually targets the _most loaded_ node and takes load off of it. When nodes are sorted from most loaded to least loaded, `minimize` is akin to sorting in descending order and `maximize` is akin to sorting in ascending order. This is a required parameter.
+
+`sort_param`::
+One and only one of the following supported parameters must be specified:
+1. `cores`: The number of total Solr cores on a node
+2. `freedisk`: The amount of free disk space for Solr's data home directory. This is always in gigabytes.
+3. `sysLoadAvg`: The system load average on a node as reported by the Metrics API under the key `solr.jvm/os.systemLoadAverage`. This is always a double value between 0 and 1 and the higher the value, the more loaded the node is.
+4. `heapUsage`: The heap usage of a node as reported by the Metrics API under the key `solr.jvm/memory.heap.usage`. This is always a double value between 0 and 1 and the higher the value, the more loaded the node is.
+
+`precision`::
+Precision tells the system the minimum (absolute) difference between two values for them to be treated as distinct. For example, a precision of 10 for `freedisk` means that two nodes whose free disk space is within 10GB of each other should be treated as equal for the purpose of sorting. This helps create ties, without which specifying multiple preferences is not useful. This is an optional parameter whose value must be a positive integer. The maximum value of precision must be less than the maximum value of the `sort_value`, if any.
+
+See the `set-cluster-preferences` API section for details on how to manage cluster preferences.
+
+=== Examples of Cluster Preferences
+
+The following are the default cluster preferences. They are applied automatically by Solr when no explicit cluster preferences have been set using the Autoscaling API.
+[source,json]
+----
+[{"minimize":"cores"}]
+----
+
+In this example, we want to minimize the number of Solr cores and, in case of a tie, maximize the amount of free disk space on each node.
+[source,json]
+----
+[
+  {"minimize" : "cores"},
+  {"maximize" : "freedisk"}
+]
+----
+
+In this example, we add a precision to the `freedisk` parameter so that nodes with free disk space within 10GB of each other are considered equal. In such a case, the tie is broken by minimizing `sysLoadAvg`.
+[source,json]
+----
+[
+  {"minimize" : "cores"},
+  {"maximize" : "freedisk", "precision" : 10},
+  {"minimize" : "sysLoadAvg"}
+]
+----
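To make the effect of `precision` concrete, here is an illustrative Python sketch that sorts nodes from least loaded to most loaded under the three preferences above. It is not Solr's implementation: the grouping rule in `approximate`, the function names, and the node metrics are all assumptions chosen to show how a precision value creates ties that the next preference then breaks.

```python
def approximate(values, precision):
    """Map each value to a group representative so that values less than
    `precision` above the representative compare as equal."""
    mapping, base = {}, None
    for value in sorted(values):
        if base is None or value - base >= precision:
            base = value  # start a new group
        mapping[value] = base
    return mapping


def sort_least_loaded_first(nodes, preferences):
    """Sort node names by applying each preference in order."""
    keys = {name: [] for name in nodes}
    for pref in preferences:
        order = "minimize" if "minimize" in pref else "maximize"
        param = pref[order]
        values = {name: metrics[param] for name, metrics in nodes.items()}
        precision = pref.get("precision")
        if precision:
            approx = approximate(values.values(), precision)
            values = {name: approx[v] for name, v in values.items()}
        sign = 1 if order == "minimize" else -1  # maximize: larger is less loaded
        for name, value in values.items():
            keys[name].append(sign * value)
    return sorted(nodes, key=lambda name: keys[name])


# Hypothetical node metrics, for illustration only.
nodes = {
    "node_a": {"cores": 2, "freedisk": 105, "sysLoadAvg": 0.5},
    "node_b": {"cores": 2, "freedisk": 100, "sysLoadAvg": 0.2},
    "node_c": {"cores": 3, "freedisk": 500, "sysLoadAvg": 0.1},
}
preferences = [
    {"minimize": "cores"},
    {"maximize": "freedisk", "precision": 10},
    {"minimize": "sysLoadAvg"},
]
print(sort_least_loaded_first(nodes, preferences))
```

In this made-up data, `node_a` and `node_b` tie on cores and their free disk space differs by only 5GB, so the `sysLoadAvg` preference decides between them. Without the precision, `node_a` would win on free disk space alone.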
+
+== Policy specification
+
+A policy is a hard rule to be satisfied by each node. If a node does not satisfy the rule then it is called a `violation`. Solr ensures that the number of violations is minimized while invoking any cluster management operations.
+
+=== Policy attributes
+A policy can have the following attributes:
+
+`cores`::
+This is a special attribute that applies to the entire cluster. It can only be used along with the `node` attribute and no other. This parameter is optional.
+
+`collection`::
+The name of the collection to which the policy rule should apply. If omitted, the rule applies to all collections. This attribute is optional.
+
+`shard`::
+The name of the shard to which the policy rule should apply. If omitted, the rule is applied for all shards in the collection. It supports a special value `#EACH` which means that the rule is applied for each shard in the collection.
+
+`replica`::
+The number of replicas that must exist to satisfy the rule. This must be a positive integer. This is a required attribute.
+
+`strict`::
+An optional boolean value. The default is `true`. If true, the rule must be satisfied. If false, Solr tries to satisfy the rule on a best effort basis but if no node can satisfy the rule then any node may be chosen.
+
+One and only one of the following attributes can be specified in addition to the above attributes:
+
+`node`::
+The name of the node to which the rule should apply. The default value is `#ANY` which means that any node in the cluster may satisfy the rule.
+
+`port`::
+The port of the node to which the rule should apply.
+
+`freedisk`::
+The free disk space in gigabytes of the node. This must be a positive 64-bit integer value.
+
+`host`::
+The host name of the node.
+
+`sysLoadAvg`::
+The system load average of the node as reported by the Metrics API under the key `solr.jvm/os.systemLoadAverage`. This is a floating point value between 0 and 1.
+
+`heapUsage`::
+The heap usage of the node as reported by the Metrics API under the key `solr.jvm/memory.heap.usage`. This is a floating point value between 0 and 1.
+
+`nodeRole`::
+The role of the node. The only supported value currently is `overseer`.
+
+`ip_1, ip_2, ip_3, ip_4`::
+The least significant to most significant segments of IP address. For example, for an IP address `192.168.1.2`, `ip_1 = 2`, `ip_2 = 1`, `ip_3 = 168`, `ip_4 = 192`.
+
+`sysprop.<system_property_name>`::
+The system property set on the node on startup.
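The numbering of the `ip_1` to `ip_4` attributes described above is easy to get backwards, so here is a small Python sketch of the mapping. The function name is hypothetical; the numbering follows the `192.168.1.2` example in the attribute description.

```python
def ip_segments(ip_address):
    """Split a dotted IPv4 address into ip_1 (least significant) .. ip_4."""
    parts = ip_address.split(".")
    if len(parts) != 4:
        raise ValueError("expected a dotted IPv4 address: " + ip_address)
    # ip_1 is the last octet and ip_4 is the first.
    return {
        "ip_{}".format(index + 1): int(part)
        for index, part in enumerate(reversed(parts))
    }


print(ip_segments("192.168.1.2"))
```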
+
+=== Policy Operators
+
+Each attribute in the policy may specify one of the following operators along with the value.
+
+* `<`: Less than
+* `>`: Greater than
+* `!`: Not
+* None means equal
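The operators above can be read as a prefix on the attribute value. The following Python sketch shows one way to split and evaluate such a value; it is an assumption-laden illustration, not Solr's parser. The function names are hypothetical, and special values such as `#EACH` and `#ANY` are deliberately not handled.

```python
def parse_operand(raw):
    """Split a policy value such as "<2" into (operator, operand)."""
    text = str(raw)
    if text[:1] in ("<", ">", "!"):
        return text[0], text[1:]
    return "=", text  # no operator prefix means equality


def matches(raw_condition, actual):
    """Check whether an actual value satisfies a policy value."""
    operator, expected = parse_operand(raw_condition)
    if operator == "<":
        return float(actual) < float(expected)
    if operator == ">":
        return float(actual) > float(expected)
    if operator == "!":
        return str(actual) != expected
    return str(actual) == expected


print(parse_operand("<2"))
print(matches("!us-east-1a", "us-east-1b"))
```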
+
+=== Examples of policy rules
+
+`Example 1`::
+Do not place more than one replica of the same shard on the same node
+
+[source,json]
+----
+{"replica": "<2", "shard": "#EACH", "node": "#ANY"}
+----
+
+`Example 2`::
+Do not place more than 10 cores on any node. This rule can only be added to the cluster policy because it mentions the `cores` attribute, which is only applicable cluster-wide.
+[source,json]
+----
+{"cores": "<10", "node": "#ANY"}
+----
+
+`Example 3`::
+Place exactly 1 replica of each shard of collection `xyz` on a node running on port `8983`
+[source,json]
+----
+{"replica": 1, "shard": "#EACH", "collection": "xyz", "port": "8983"}
+----
+
+`Example 4`::
+Place all replicas on a node with system property `availability_zone=us-east-1a`. Note that we have to write this rule in the negative sense i.e. *0* replicas must be on nodes *not* having the sysprop `availability_zone=us-east-1a`
+[source,json]
+----
+{"replica": 0, "sysprop.availability_zone": "!us-east-1a"}
+----
+
+`Example 5`::
+Do not place any replica on a node which has the overseer role. Note that the role is added by the `addRole` collection API. It is *not* automatically the node which is currently the overseer.
+[source,json]
+----
+{"replica": 0, "nodeRole": "overseer"}
+----
+
+`Example 6`::
+Place all replicas in nodes with freedisk more than 500GB. Here again, we have to write the rule in the negative sense.
+[source,json]
+----
+{"replica": 0, "freedisk": "<500"}
+----
+
+`Example 7`::
+Place all replicas in nodes with freedisk more than 500GB when possible. Here we use the strict keyword to signal that this rule is to be honored on a best effort basis.
+[source,json]
+----
+{"replica": 0, "freedisk": "<500", "strict" : false}
+----
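As a worked illustration of Example 1 (`{"replica": "<2", "shard": "#EACH", "node": "#ANY"}`), the sketch below counts replicas per shard and node and reports the combinations that hold two or more. The placement data and the function name are hypothetical, and this is not how Solr evaluates policies internally.

```python
from collections import Counter


def find_example1_violations(replicas):
    """Return (collection, shard, node) triples hosting two or more replicas.

    `replicas` holds one (collection, shard, node) tuple per replica.
    """
    counts = Counter(replicas)
    return sorted(key for key, count in counts.items() if count >= 2)


# Hypothetical placement: shard1 has two replicas on the same node.
replicas = [
    ("gettingstarted", "shard1", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard1", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard2", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard2", "127.0.0.1:7574_solr"),
]
print(find_example1_violations(replicas))
```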
+
+
+== Cluster Policy vs Collection-specific Policy
+
+By default, the cluster policy, if it exists, is used automatically for all collections in the cluster. However, we can create named policies which can be attached to a collection at the time of its creation by specifying the policy name along with a `policy` parameter.
+
+When a collection-specific policy is used, the rules in that policy are appended to the rules in the cluster policy and the combination of both is used. Therefore, it is recommended that you do not add rules to a collection-specific policy that conflict with the ones in the cluster policy. Doing so will disqualify all nodes in the cluster from matching all criteria and make the policy useless. Also, if `maxShardsPerNode` is specified at the time of collection creation, then both `maxShardsPerNode` and the policy rules must be satisfied.
+
+Some attributes such as `cores` can only be used in the cluster policy.
+
+The policy is used by Collection APIs such as:
+
+* create
+* createshard
+* addreplica
+* restore
+* splitshard
+
+In the future, the policy and preferences will be used by the Autoscaling framework to automatically change the cluster in response to events such as a node being added or lost.
+
--- a/solr/solr-ref-guide/src/solrcloud-autoscaling.adoc
+++ b/solr/solr-ref-guide/src/solrcloud-autoscaling.adoc
@ -0,0 +1,30 @@
+= SolrCloud Autoscaling
+:page-shortname: solrcloud-autoscaling
+:page-permalink: solrcloud-autoscaling.html
+:page-children: solrcloud-autoscaling-overview, solrcloud-autoscaling-api, solrcloud-autoscaling-policy-preferences
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The goal of autoscaling is to make SolrCloud cluster management easier by providing a way for changes to the cluster to be more automatic and more intelligent.
+
+Autoscaling includes an API to manage cluster-wide and collection-specific policies and preferences and a rules syntax to define the guidelines for your cluster. Future Solr releases will include features to utilize the policies and preferences so they perform actions automatically when the rules are violated.
+
+The following sections describe the autoscaling features of SolrCloud:
+
+* <<solrcloud-autoscaling-overview.adoc#solrcloud-autoscaling-overview,Overview of Autoscaling in SolrCloud>>
+* <<solrcloud-autoscaling-api.adoc#solrcloud-autoscaling-api,SolrCloud Autoscaling API>>
+* <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,SolrCloud Autoscaling Policy and Preferences>>
--- a/solr/solr-ref-guide/src/solrcloud.adoc
+++ b/solr/solr-ref-guide/src/solrcloud.adoc
@ -1,7 +1,7 @@
 = SolrCloud
 :page-shortname: solrcloud
 :page-permalink: solrcloud.html
-:page-children: getting-started-with-solrcloud, how-solrcloud-works, solrcloud-configuration-and-parameters, rule-based-replica-placement, cross-data-center-replication-cdcr
+:page-children: getting-started-with-solrcloud, how-solrcloud-works, solrcloud-configuration-and-parameters, rule-based-replica-placement, cross-data-center-replication-cdcr, solrcloud-autoscaling
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
 // distributed with this work for additional information
@ -45,3 +45,4 @@ In this section, we'll cover everything you need to know about using Solr in Sol
 ** <<configsets-api.adoc#configsets-api,ConfigSets API>>
 * <<rule-based-replica-placement.adoc#rule-based-replica-placement,Rule-based Replica Placement>>
 * <<cross-data-center-replication-cdcr.adoc#cross-data-center-replication-cdcr,Cross Data Center Replication (CDCR)>>
+* <<solrcloud-autoscaling.adoc#solrcloud-autoscaling,SolrCloud Autoscaling>>