SOLR-13235: Split Collections API Ref Guide page into several smaller child pages

Cassandra Targett 2019-06-12 16:07:53 -05:00
parent 8c5dd4a98b
commit e139c86769
20 changed files with 3086 additions and 2959 deletions


@ -114,6 +114,8 @@ Other Changes
* SOLR-13347: Transaction log to natively support UUID types (Thomas Wöckinger via noble)
* SOLR-13235: Split Collections API Ref Guide page into several smaller child pages (Cassandra Targett)
================== 8.1.2 ==================
Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release.


@ -32,18 +32,18 @@ rejected with an error since there is no logic by which to distribute documents
== Standard Aliases
Standard aliases are created and updated using the <<collections-api.adoc#createalias,CREATEALIAS>> command.
Standard aliases are created and updated using the <<collection-aliasing.adoc#createalias,CREATEALIAS>> command.
The current list of collections that are members of an alias can be verified via the
<<collections-api.adoc#clusterstatus,CLUSTERSTATUS>> command.
<<cluster-node-management.adoc#clusterstatus,CLUSTERSTATUS>> command.
The full definition of all aliases including metadata about that alias (in the case of routed aliases, see below)
can be verified via the <<collections-api.adoc#listaliases,LISTALIASES>> command.
can be verified via the <<collection-aliasing.adoc#listaliases,LISTALIASES>> command.
Alternatively this information is available by checking `/aliases.json` in ZooKeeper with either the native ZooKeeper
client or in the <<cloud-screens.adoc#tree-view,tree page>> of the cloud menu in the admin UI.
Aliases may be deleted via the <<collections-api.adoc#deletealias,DELETEALIAS>> command.
Aliases may be deleted via the <<collection-aliasing.adoc#deletealias,DELETEALIAS>> command.
When deleting an alias, underlying collections are *unaffected*.
TIP: Any alias (standard or routed) that references multiple collections may complicate relevancy.
@ -99,9 +99,9 @@ If you need to store a lot of timestamped data in Solr, such as logs or IoT sens
==== How It Works
First you create a time routed aliases using the <<collections-api.adoc#createalias,CREATEALIAS>> command with the
First you create a time routed alias using the <<collection-aliasing.adoc#createalias,CREATEALIAS>> command with the
desired router settings.
Most of the settings are editable at a later time using the <<collections-api.adoc#aliasprop,ALIASPROP>> command.
Most of the settings are editable at a later time using the <<collection-aliasing.adoc#aliasprop,ALIASPROP>> command.
The first collection will be created automatically, along with an alias pointing to it.
Each underlying Solr "core" in a collection that is a member of a TRA has a special core property referencing the alias.
@ -137,7 +137,7 @@ Each time a new collection is added, the oldest collections in the TRA are exami
All this happens synchronously, potentially adding seconds to the update request and indexing latency.
+
If `router.preemptiveCreateMath` is configured and if the document arrives within this window then it will occur
asynchronously. See <<collections-api.adoc#time-routed-alias-parameters,Time Routed Alias Parameters>> for more information.
asynchronously. See <<collection-aliasing.adoc#time-routed-alias-parameters,Time Routed Alias Parameters>> for more information.
Any other type of update like a commit or delete is routed by RAUP to all collections.
Generally speaking, this is not a performance concern. When Solr receives a delete or commit wherein nothing is deleted
@ -171,9 +171,9 @@ that must be segregated into collections for cluster management or security reas
==== How It Works
First you create a category routed alias using the <<collections-api.adoc#createalias,CREATEALIAS>> command with the
First you create a category routed alias using the <<collection-aliasing.adoc#createalias,CREATEALIAS>> command with the
desired router settings.
Most of the settings are editable at a later time using the <<collections-api.adoc#aliasprop,ALIASPROP>> command.
Most of the settings are editable at a later time using the <<collection-aliasing.adoc#aliasprop,ALIASPROP>> command.
The alias will be created with a special place-holder collection which will always be named
`myAlias\__CRA__NEW_CATEGORY_ROUTED_ALIAS_WAITING_FOR_DATA\__TEMP`. The first document indexed into the CRA
@ -182,7 +182,7 @@ The alias will be created with a special place-holder collection which will alwa
a new value for the field is encountered.
CAUTION: To guard against runaway collection creation options for limiting the total number of categories, and for
rejecting values that don't match, a regular expression parameter is provided (see <<collections-api.adoc#category-routed-alias-parameters,Category Routed Alias Parameters>> for
rejecting values that don't match, a regular expression parameter is provided (see <<collection-aliasing.adoc#category-routed-alias-parameters,Category Routed Alias Parameters>> for
details).
+
Note that by providing very large or very permissive values for these options you are accepting the risk that


@ -37,7 +37,7 @@ The BlobHandler is automatically registered in the .system collection. The `solr
If you do not use the `-shards` or `-replicationFactor` options, then defaults of numShards=1 and replicationFactor=3 (or maximum nodes in the cluster) will be used.
You can create the `.system` collection with the <<collections-api.adoc#create,CREATE command>> of the Collections API, as in this example:
You can create the `.system` collection with the <<collection-management.adoc#create,CREATE command>> of the Collections API, as in this example:
[.dynamic-tabs]
--


@ -0,0 +1,502 @@
= Cluster and Node Management Commands
:page-toclevels: 1
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
A cluster is a set of Solr nodes operating in coordination with each other.
These API commands work with a SolrCloud cluster at the entire cluster level, or on individual nodes.
[[clusterstatus]]
== CLUSTERSTATUS: Cluster Status
Fetch the cluster status including collections, shards, replicas, configuration name as well as collection aliases and cluster properties.
`/admin/collections?action=CLUSTERSTATUS`
=== CLUSTERSTATUS Parameters
`collection`::
The collection or alias name for which information is requested. If omitted, information on all collections in the cluster will be returned. If an alias is supplied, information on the collections in the alias will be returned.
`shard`::
The shard(s) for which information is requested. Multiple shard names can be specified as a comma-separated list.
`\_route_`::
Use this parameter when you need the details of the shard to which a particular document belongs but you don't know which shard that is.
=== CLUSTERSTATUS Response
The response will include the status of the request and the status of the cluster.
=== Examples using CLUSTERSTATUS
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS
----
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":333},
"cluster":{
"collections":{
"collection1":{
"shards":{
"shard1":{
"range":"80000000-ffffffff",
"state":"active",
"replicas":{
"core_node1":{
"state":"active",
"core":"collection1",
"node_name":"127.0.1.1:8983_solr",
"base_url":"http://127.0.1.1:8983/solr",
"leader":"true"},
"core_node3":{
"state":"active",
"core":"collection1",
"node_name":"127.0.1.1:8900_solr",
"base_url":"http://127.0.1.1:8900/solr"}}},
"shard2":{
"range":"0-7fffffff",
"state":"active",
"replicas":{
"core_node2":{
"state":"active",
"core":"collection1",
"node_name":"127.0.1.1:7574_solr",
"base_url":"http://127.0.1.1:7574/solr",
"leader":"true"},
"core_node4":{
"state":"active",
"core":"collection1",
"node_name":"127.0.1.1:7500_solr",
"base_url":"http://127.0.1.1:7500/solr"}}}},
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"znodeVersion": 11,
"autoCreated":"true",
"configName" : "my_config",
"aliases":["both_collections"]
},
"collection2":{
"..."
}
},
"aliases":{ "both_collections":"collection1,collection2" },
"roles":{
"overseer":[
"127.0.1.1:8983_solr",
"127.0.1.1:7574_solr"]
},
"live_nodes":[
"127.0.1.1:7574_solr",
"127.0.1.1:7500_solr",
"127.0.1.1:8983_solr",
"127.0.1.1:8900_solr"]
}
}
----
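As a minimal sketch of consuming this response on the client side, the following Python walks a parsed CLUSTERSTATUS response and reports the leader of each shard. The function name `shard_leaders` is hypothetical and not part of Solr; the sample data is a trimmed copy of the output above, and the sketch assumes leaders are flagged with the string `"true"` as that output shows.

```python
def shard_leaders(cluster_status, collection):
    """Map each shard name to the node_name of its leader replica (or None)."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    leaders = {}
    for shard_name, shard in shards.items():
        leaders[shard_name] = None
        for replica in shard["replicas"].values():
            # The sample output marks a leader with the string "true".
            if replica.get("leader") == "true":
                leaders[shard_name] = replica["node_name"]
    return leaders


# Trimmed copy of the sample CLUSTERSTATUS output shown above.
sample = {
    "cluster": {
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {"replicas": {
                        "core_node1": {"node_name": "127.0.1.1:8983_solr", "leader": "true"},
                        "core_node3": {"node_name": "127.0.1.1:8900_solr"}}},
                    "shard2": {"replicas": {
                        "core_node2": {"node_name": "127.0.1.1:7574_solr", "leader": "true"},
                        "core_node4": {"node_name": "127.0.1.1:7500_solr"}}},
                }
            }
        }
    }
}

print(shard_leaders(sample, "collection1"))
```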
[[clusterprop]]
== CLUSTERPROP: Cluster Properties
Add, edit or delete a cluster-wide property.
`/admin/collections?action=CLUSTERPROP&name=_propertyName_&val=_propertyValue_`
=== CLUSTERPROP Parameters
`name`::
The name of the property. Supported property names are `autoAddReplicas`, `legacyCloud`, `location`, `maxCoresPerNode` and `urlScheme`. Other properties can be set
(for example, if you need them for custom plugins) but they must begin with the prefix `ext.`. Unknown properties that don't begin with `ext.` will be rejected.
`val`::
The value of the property. If the value is empty or null, the property is unset.
=== CLUSTERPROP Response
The response will include the status of the request and the properties that were updated or removed. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using CLUSTERPROP
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=urlScheme&val=https&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
</response>
----
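The naming rule for `name` can be checked on the client before a request is sent. The sketch below builds a CLUSTERPROP URL in Python; `clusterprop_url` is a hypothetical helper, and leaving out `val` to unset a property is an assumption based on the "empty or null" wording above. Solr itself still performs the authoritative validation.

```python
from urllib.parse import urlencode

# Property names listed as supported in the parameter description above.
KNOWN_PROPS = {"autoAddReplicas", "legacyCloud", "location",
               "maxCoresPerNode", "urlScheme"}


def clusterprop_url(name, val=None, base="http://localhost:8983/solr"):
    """Build a CLUSTERPROP request URL (hypothetical helper, not a Solr API)."""
    if name not in KNOWN_PROPS and not name.startswith("ext."):
        raise ValueError(f"unknown property {name!r}: custom names must start with 'ext.'")
    params = {"action": "CLUSTERPROP", "name": name}
    if val is not None:
        # Assumption: omitting val is treated like a null value (unset).
        params["val"] = val
    return f"{base}/admin/collections?{urlencode(params)}"


print(clusterprop_url("urlScheme", "https"))
```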
=== Setting Cluster-Wide Defaults
It is possible to set cluster-wide default values for certain attributes of a collection, using the `defaults` parameter.
*Set/update default values*
[source]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{
"set-obj-property": {
"defaults" : {
"collection": {
"numShards": 2,
"nrtReplicas": 1,
"tlogReplicas": 1,
"pullReplicas": 1
}
}
}
}' http://localhost:8983/api/cluster
----
*Unset only the value of `nrtReplicas`*
[source]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{
"set-obj-property": {
"defaults" : {
"collection": {
"nrtReplicas": null
}
}
}
}' http://localhost:8983/api/cluster
----
*Unset all values in `defaults`*
[source]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{ "set-obj-property" : {
"defaults" : null
}}' http://localhost:8983/api/cluster
----
NOTE: Until Solr 7.5, cluster properties supported a `collectionDefaults` key which is now deprecated and
replaced with `defaults`. Using the `collectionDefaults` parameter in Solr 7.4 or 7.5 will continue to work
but the format of the properties will automatically be converted to the new nested structure.
Support for the "collectionDefaults" key will be removed in Solr 9.
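Hand-written JSON bodies like the ones above are easy to unbalance, so it may help to generate them. This is a minimal sketch, assuming only the `set-obj-property` / `defaults` / `collection` nesting shown in the curl examples; the function names are hypothetical.

```python
import json


def set_defaults_payload(**collection_defaults):
    """JSON body that sets (or, with None values, unsets) collection defaults."""
    body = {"set-obj-property": {"defaults": {"collection": collection_defaults}}}
    return json.dumps(body)


def unset_all_defaults_payload():
    """JSON body that removes every value under `defaults`."""
    return json.dumps({"set-obj-property": {"defaults": None}})


# Python None is serialized as JSON null, matching the examples above.
print(set_defaults_payload(numShards=2, nrtReplicas=1))
print(set_defaults_payload(nrtReplicas=None))
print(unset_all_defaults_payload())
```

Either string can be passed to `curl --data-binary` or to any HTTP client that posts to `/api/cluster`.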
[[balanceshardunique]]
== BALANCESHARDUNIQUE: Balance a Property Across Nodes
`/admin/collections?action=BALANCESHARDUNIQUE&collection=_collectionName_&property=_propertyName_`
Ensures that a particular property is distributed evenly amongst the physical nodes that make up a collection. If the property already exists on a replica, every effort is made to leave it there. If the property is *not* on any replica on a shard, one is chosen and the property is added.
=== BALANCESHARDUNIQUE Parameters
`collection`::
The name of the collection to balance the property in. This parameter is required.
`property`::
The property to balance. The literal `property.` is prepended to this property if not specified explicitly. This parameter is required.
`onlyactivenodes`::
Defaults to `true`. Normally, the property is instantiated on active nodes only. If this parameter is specified as `false`, then inactive nodes are also included for distribution.
`shardUnique`::
Something of a safety valve. There is one pre-defined property (`preferredLeader`) that defaults this value to `true`. For all other properties that are balanced, this must be set to `true` or an error message will be returned.
=== BALANCESHARDUNIQUE Response
The response will include the status of the request. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using BALANCESHARDUNIQUE
*Input*
Either of these commands would put the "preferredLeader" property on one replica in every shard in the "collection1" collection.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=BALANCESHARDUNIQUE&collection=collection1&property=preferredLeader&wt=xml
http://localhost:8983/solr/admin/collections?action=BALANCESHARDUNIQUE&collection=collection1&property=property.preferredLeader&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">9</int>
</lst>
</response>
----
Examining the clusterstate after issuing this call should show exactly one replica in each shard that has this property.
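The two input URLs above are equivalent because the literal `property.` is prepended when it is missing. A client can mirror that rule as sketched below; `normalize_property` and `balanceshardunique_url` are hypothetical helpers, and treating the prefix check as case-sensitive is an assumption.

```python
from urllib.parse import urlencode


def normalize_property(prop):
    """Prepend the literal 'property.' unless it is already there."""
    return prop if prop.startswith("property.") else "property." + prop


def balanceshardunique_url(collection, prop, shard_unique=None,
                           base="http://localhost:8983/solr"):
    """Build a BALANCESHARDUNIQUE request URL (hypothetical helper)."""
    params = {
        "action": "BALANCESHARDUNIQUE",
        "collection": collection,
        "property": normalize_property(prop),
    }
    if shard_unique is not None:
        # Needed for properties other than preferredLeader, per the text above.
        params["shardUnique"] = "true" if shard_unique else "false"
    return f"{base}/admin/collections?{urlencode(params)}"


print(balanceshardunique_url("collection1", "preferredLeader"))
```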
[[utilizenode]]
== UTILIZENODE: Utilize a New Node
This command can be used to move some replicas from the existing nodes to either a new node or a less loaded node to reduce the load on the existing node.
This uses your autoscaling policies and preferences to identify which replica needs to be moved. It tries to fix any policy violations first and then it tries to move some load off of the most loaded nodes according to the preferences.
`/admin/collections?action=UTILIZENODE&node=nodeName`
=== UTILIZENODE Parameters
`node`:: The name of the node that needs to be utilized. This parameter is required.
[[replacenode]]
== REPLACENODE: Move All Replicas in a Node to Another
This command recreates the replicas of one node (the source) on one or more other nodes (the target). After each replica is copied, the corresponding replica on the source node is deleted.
For source replicas that are also shard leaders the operation will wait for the number of seconds set with the `timeout` parameter to make sure there's an active replica that can become a leader (either an existing replica becoming a leader or the new replica completing recovery and becoming a leader).
The API uses the Autoscaling framework to find nodes that can satisfy the disk requirements for the new replicas but only when an Autoscaling policy is configured. Refer to <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>> section for more details.
`/admin/collections?action=REPLACENODE&sourceNode=_source-node_&targetNode=_target-node_`
=== REPLACENODE Parameters
`sourceNode`::
The source node from which the replicas need to be copied. This parameter is required.
`targetNode`::
The target node where replicas will be copied. If this parameter is not provided, Solr will identify nodes automatically based on policies or number of cores in each node.
`parallel`::
If this flag is set to `true`, all replicas are created in separate threads. Keep in mind that this can lead to very high network and disk I/O if the replicas have very large indices. The default is `false`.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
`timeout`::
Time in seconds to wait until new replicas are created, and until leader replicas are fully recovered. The default is `300`, or 5 minutes.
[IMPORTANT]
====
This operation does not hold the necessary locks on the replicas that belong to the source node, so don't perform other collection operations during this period.
====
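The REPLACENODE parameters can be assembled as in the following sketch, which sends the optional parameters only when they are set so that Solr's defaults apply otherwise. `replacenode_url` is a hypothetical helper and the node names are illustrative.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def replacenode_url(source_node, target_node=None, parallel=False,
                    timeout=None, async_id=None,
                    base="http://localhost:8983/solr"):
    """Build a REPLACENODE request URL (hypothetical helper)."""
    params = {"action": "REPLACENODE", "sourceNode": source_node}
    if target_node is not None:
        params["targetNode"] = target_node
    if parallel:
        params["parallel"] = "true"
    if timeout is not None:
        params["timeout"] = str(timeout)  # seconds
    if async_id is not None:
        params["async"] = async_id
    return f"{base}/admin/collections?{urlencode(params)}"


url = replacenode_url("127.0.1.1:8983_solr", timeout=600, async_id="replace-1")
print(url)
# Decoding the query string shows the parameters that will be sent.
print(parse_qs(urlsplit(url).query))
```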
[[deletenode]]
== DELETENODE: Delete Replicas in a Node
Deletes all replicas of all collections in that node. Please note that the node itself will remain as a live node after this operation.
`/admin/collections?action=DELETENODE&node=nodeName`
=== DELETENODE Parameters
`node`::
The node to be removed. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
[[addrole]]
== ADDROLE: Add a Role
`/admin/collections?action=ADDROLE&role=_roleName_&node=_nodeName_`
Assigns a role to a given node in the cluster. The only supported role is `overseer`.
Use this command to dedicate a particular node as Overseer. Invoke it multiple times to add more nodes. This is useful in large clusters where an Overseer is likely to get overloaded. If available, one among the list of nodes which are assigned the 'overseer' role would become the overseer. The system would assign the role to any other node if none of the designated nodes are up and running.
=== ADDROLE Parameters
`role`::
The name of the role. The only supported role as of now is `overseer`. This parameter is required.
`node`::
The name of the node that will be assigned the role. It is possible to assign a role even before that node is started. This parameter is required.
=== ADDROLE Response
The response will include the status of the request and the properties that were updated or removed. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using ADDROLE
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ADDROLE&role=overseer&node=192.167.1.2:8983_solr&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
</response>
----
[[removerole]]
== REMOVEROLE: Remove Role
Remove an assigned role. This API is used to undo the roles assigned using the ADDROLE operation.
`/admin/collections?action=REMOVEROLE&role=_roleName_&node=_nodeName_`
=== REMOVEROLE Parameters
`role`::
The name of the role. The only supported role as of now is `overseer`. This parameter is required.
`node`::
The name of the node where the role should be removed.
=== REMOVEROLE Response
The response will include the status of the request and the properties that were updated or removed. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using REMOVEROLE
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=REMOVEROLE&role=overseer&node=192.167.1.2:8983_solr&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
</response>
----
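ADDROLE and REMOVEROLE take the same two parameters, so one small helper can cover both. The sketch below is hypothetical client code; it rejects any role other than `overseer` because that is the only role the text above describes as supported.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def role_url(action, node, role="overseer", base="http://localhost:8983/solr"):
    """Build an ADDROLE or REMOVEROLE request URL (hypothetical helper)."""
    if action not in ("ADDROLE", "REMOVEROLE"):
        raise ValueError("action must be ADDROLE or REMOVEROLE")
    if role != "overseer":
        raise ValueError("the only supported role is 'overseer'")
    params = {"action": action, "role": role, "node": node}
    return f"{base}/admin/collections?{urlencode(params)}"


add = role_url("ADDROLE", "192.167.1.2:8983_solr")
remove = role_url("REMOVEROLE", "192.167.1.2:8983_solr")
print(add)
print(parse_qs(urlsplit(remove).query))
```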
[[overseerstatus]]
== OVERSEERSTATUS: Overseer Status and Statistics
Returns the current status of the overseer, performance statistics of various overseer APIs, and the last 10 failures per operation type.
`/admin/collections?action=OVERSEERSTATUS`
=== Examples using OVERSEERSTATUS
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=OVERSEERSTATUS
----
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":33},
"leader":"127.0.1.1:8983_solr",
"overseer_queue_size":0,
"overseer_work_queue_size":0,
"overseer_collection_queue_size":2,
"overseer_operations":[
"createcollection",{
"requests":2,
"errors":0,
"avgRequestsPerSecond":0.7467088842794136,
"5minRateRequestsPerSecond":7.525069023276674,
"15minRateRequestsPerSecond":10.271274280947182,
"avgTimePerRequest":0.5050685,
"medianRequestTime":0.5050685,
"75thPcRequestTime":0.519016,
"95thPcRequestTime":0.519016,
"99thPcRequestTime":0.519016,
"999thPcRequestTime":0.519016},
"removeshard",{
"..."
}],
"collection_operations":[
"splitshard",{
"requests":1,
"errors":1,
"recent_failures":[{
"request":{
"operation":"splitshard",
"shard":"shard2",
"collection":"example1"},
"response":[
"Operation splitshard caused exception:","org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: No shard with the specified name exists: shard2",
"exception",{
"msg":"No shard with the specified name exists: shard2",
"rspCode":400}]}],
"avgRequestsPerSecond":0.8198143044809885,
"5minRateRequestsPerSecond":8.043840552427673,
"15minRateRequestsPerSecond":10.502079828515368,
"avgTimePerRequest":2952.7164175,
"medianRequestTime":2952.7164175000003,
"75thPcRequestTime":5904.384052,
"95thPcRequestTime":5904.384052,
"99thPcRequestTime":5904.384052,
"999thPcRequestTime":5904.384052},
"..."
],
"overseer_queue":[
"..."
],
"..."
}
----
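In the sample output above, `overseer_operations`, `collection_operations`, and each failure's `response` are flat lists that alternate a name with its value. The following sketch pairs those entries up and collects the recent failure messages per operation. The helper names are hypothetical, and the sample is a trimmed copy of the output above.

```python
def pair_up(flat):
    """Turn ["name1", value1, "name2", value2, ...] into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def failure_messages(status):
    """Map each collection operation to the messages of its recent failures."""
    result = {}
    for name, stats in pair_up(status.get("collection_operations", [])).items():
        messages = []
        for failure in stats.get("recent_failures", []):
            # The failure response is itself a flat name/value list.
            exception = pair_up(failure["response"]).get("exception", {})
            if "msg" in exception:
                messages.append(exception["msg"])
        if messages:
            result[name] = messages
    return result


# Trimmed copy of the sample OVERSEERSTATUS output shown above.
sample = {
    "collection_operations": [
        "splitshard", {
            "requests": 1,
            "errors": 1,
            "recent_failures": [{
                "request": {"operation": "splitshard", "shard": "shard2",
                            "collection": "example1"},
                "response": [
                    "Operation splitshard caused exception:",
                    "org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: No shard with the specified name exists: shard2",
                    "exception", {
                        "msg": "No shard with the specified name exists: shard2",
                        "rspCode": 400}]}]},
    ]
}

print(failure_messages(sample))
```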
[[migratestateformat]]
== MIGRATESTATEFORMAT: Migrate Cluster State
An expert-level utility API to move a collection from the shared `clusterstate.json` ZooKeeper node (created with `stateFormat=1`, the default in all Solr releases prior to 5.0) to the per-collection `state.json` stored in ZooKeeper (created with `stateFormat=2`, the current default) seamlessly without any application down-time.
`/admin/collections?action=MIGRATESTATEFORMAT&collection=<collection_name>`
=== MIGRATESTATEFORMAT Parameters
`collection`::
The name of the collection to be migrated from `clusterstate.json` to its own `state.json` ZooKeeper node. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
This API is useful in migrating any collections created prior to Solr 5.0 to the more scalable cluster state format now used by default. If a collection was created in any Solr 5.x version or higher, then executing this command is not necessary.


@ -0,0 +1,433 @@
= Collection Aliasing
:page-toclevels: 1
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
A collection alias is a virtual collection which Solr treats the same as a normal collection. The alias collection may point to one or more real collections.
Some use cases for collection aliasing:
* Time series data
* Reindexing content behind the scenes
[[createalias]]
== CREATEALIAS: Create or Modify an Alias for a Collection
The `CREATEALIAS` action will create a new alias pointing to one or more collections.
Aliases come in 2 flavors: standard and routed.
*Standard aliases* are simple: CREATEALIAS registers the alias name with the names of one or more collections provided
by the command.
If an alias with the same name already exists, it is replaced/updated.
A standard alias can serve as a means to rename a collection, and can be used to atomically swap
which backing/underlying collection is "live" for various purposes.
When Solr searches an alias pointing to multiple collections, Solr will search all shards of all the collections as an
aggregated whole.
While it is possible to send updates to an alias spanning multiple collections, standard aliases have no logic for
distributing documents among the referenced collections so all updates will go to the first collection in the list.
`/admin/collections?action=CREATEALIAS&name=_name_&collections=_collectionlist_`
*Routed aliases* are aliases with additional capabilities to act as a kind of super-collection that route
updates to the correct collection. Routing is data driven and may be based on a temporal field or on categories
specified in a field (normally string based).
See <<aliases.adoc#routed-aliases,Routed Aliases>> for some important high-level information
before getting started.
[source,text]
----
localhost:8983/solr/admin/collections?action=CREATEALIAS&name=timedata&router.start=NOW/DAY&router.field=evt_dt&router.name=time&router.interval=%2B1DAY&router.maxFutureMs=3600000&create-collection.collection.configName=myConfig&create-collection.numShards=2
----
If run on Jan 15, 2018, the above will create a time routed alias named `timedata` that contains collections with names prefixed
with `timedata` and an initial collection named `timedata_2018_01_15` will be created immediately. Updates sent to this
alias with a (required) value in `evt_dt` that is before or after 2018-01-15 will be rejected, until the last 60
minutes of 2018-01-15. After 2018-01-15T23:00:00 documents for either 2018-01-15 or 2018-01-16 will be accepted.
As soon as the system receives a document for an allowable time window for which there is no collection it will
automatically create the next required collection (and potentially any intervening collections if `router.interval` is
smaller than `router.maxFutureMs`). Both the initial collection and any subsequent collections will be created using
the specified configset. All collection creation parameters other than `name` are allowed, prefixed
by `create-collection.`
This means that one could, for example, partition their collections by day, and within each daily collection route
the data to shards based on customer id. Such shards can be of any type (NRT, PULL or TLOG), and rule-based replica
placement strategies may also be used.
The values supplied in this command for collection creation will be retained
in alias properties, and can be verified by inspecting `aliases.json` in ZooKeeper.
NOTE: Presently only updates are routed and queries are distributed to all collections in the alias, but future
features may enable routing of the query to the single appropriate collection based on a special parameter or perhaps
a filter on the routed field.
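Date math such as `+1DAY` must be URL-encoded (the example URL above uses `%2B1DAY`), which is easy to overlook when the request is typed by hand. As a minimal sketch, the Python below builds a time routed CREATEALIAS request and lets `urllib.parse.urlencode` handle the escaping. `create_time_routed_alias_url` is a hypothetical helper; the parameter values repeat the example above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def create_time_routed_alias_url(name, field, start, interval, config_name,
                                 max_future_ms=None, num_shards=None,
                                 base="http://localhost:8983/solr"):
    """Build a CREATEALIAS URL for a time routed alias (hypothetical helper)."""
    params = {
        "action": "CREATEALIAS",
        "name": name,
        "router.name": "time",
        "router.field": field,
        "router.start": start,
        "router.interval": interval,
        "create-collection.collection.configName": config_name,
    }
    if max_future_ms is not None:
        params["router.maxFutureMs"] = str(max_future_ms)
    if num_shards is not None:
        params["create-collection.numShards"] = str(num_shards)
    # urlencode escapes '+' as %2B, so "+1DAY" is not read as a space.
    return f"{base}/admin/collections?{urlencode(params)}"


url = create_time_routed_alias_url(
    "timedata", "evt_dt", "NOW/DAY", "+1DAY", "myConfig",
    max_future_ms=3600000, num_shards=2)
print(url)
```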
=== CREATEALIAS Parameters
`name`::
The alias name to be created. This parameter is required. If the alias is to be routed it also functions
as a prefix for the names of the dependent collections that will be created. It must therefore adhere to normal
requirements for collection naming.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
==== Standard Alias Parameters
`collections`::
A comma-separated list of collections to be aliased. The collections must already exist in the cluster.
This parameter signals the creation of a standard alias. If it is present all routing parameters are
prohibited. If routing parameters are present this parameter is prohibited.
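The mutual exclusion between `collections` and the routing parameters can be checked before a request is sent. This sketch is a hypothetical client-side check, not Solr's own validation; it assumes that "routing parameters" means any parameter starting with `router.` or `create-collection.`.

```python
def alias_kind(params):
    """Classify CREATEALIAS parameters as 'standard' or 'routed'."""
    has_collections = "collections" in params
    # Assumption: these two prefixes identify the routing parameters.
    routing = [key for key in params
               if key.startswith("router.") or key.startswith("create-collection.")]
    if has_collections and routing:
        raise ValueError("'collections' cannot be combined with: " + ", ".join(routing))
    if has_collections:
        return "standard"
    if routing:
        return "routed"
    raise ValueError("expected either 'collections' or routing parameters")


print(alias_kind({"name": "testalias", "collections": "anotherCollection,testCollection"}))
print(alias_kind({"name": "timedata", "router.name": "time", "router.field": "evt_dt"}))
```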
==== Routed Alias Parameters
Most routed alias parameters become _alias properties_ that can subsequently be inspected and <<aliasprop,modified>>.
`router.name`::
The type of routing to use. Presently only `time` and `category` are valid. This parameter is required.
`router.field`::
The field to inspect to determine which underlying collection an incoming document should be routed to.
This field is required on all incoming documents.
`create-collection.*`::
The `*` wildcard can be replaced with any parameter from the <<collection-management.adoc#create,CREATE>> command except `name`. All other fields
are identical in requirements and naming except that we insist that the configset be explicitly specified.
The configset must be created beforehand, either uploaded or copied and modified.
It's probably a bad idea to use "data driven" mode as schema mutations might happen concurrently leading to errors.
==== Time Routed Alias Parameters
`router.start`::
The start date/time of data for this time routed alias in Solr's standard date/time format (i.e., ISO-8601 or "NOW"
optionally with <<working-with-dates.adoc#date-math,date math>>).
+
The first collection created for the alias will be internally named after this value.
If a document is submitted with an earlier value for `router.field` than the earliest collection the alias points to, then
it will yield an error since it can't be routed. This date/time MUST NOT have a milliseconds component other than 0.
Particularly, this means `NOW` will fail 999 times out of 1000, though `NOW/SECOND`, `NOW/MINUTE`, etc. will work
just fine. This parameter is required.
`TZ`::
The timezone to be used when evaluating any date math in router.start or router.interval. This is equivalent to the
same parameter supplied to search queries, but understand in this case it's persisted with most of the other parameters
as an alias property.
+
If GMT-4 is supplied for this value then a document dated 2018-01-14T21:00:00:01.2345Z would be stored in the
myAlias_2018-01-15_01 collection (assuming an interval of +1HOUR).
+
The default timezone is UTC.
`router.interval`::
A date math expression that will be appended to a timestamp to determine the next collection in the series.
Any date math expression that can be evaluated if appended to a timestamp of the form 2018-01-15T16:17:18 will
work here.
+
This parameter is required.
`router.maxFutureMs`::
The maximum milliseconds into the future that a document is allowed to have in `router.field` for it to be accepted
without error. If there were no limit, then an erroneous value could trigger many collections to be created.
+
The default is 10 minutes.
`router.preemptiveCreateMath`::
A date math expression that results in early creation of new collections.
+
If a document arrives with a timestamp that is after the end time of the most recent collection minus this
interval, then the next (and only the next) collection will be created asynchronously. Without this setting, collections are created
synchronously when required by the document time stamp and thus block the flow of documents until the collection
is created (possibly several seconds). Preemptive creation reduces these hiccups. If set to enough time (perhaps
an hour or more) then if there are problems creating a collection, this window of time might be enough to take
corrective action. However after a successful preemptive creation, the collection is consuming resources without
being used, and new documents will tend to be routed through it only to be routed elsewhere. Also, note that
router.autoDeleteAge is currently evaluated relative to the date of a newly created collection, and so you may
want to increase the delete age by the preemptive window amount so that the oldest collection isn't deleted too
soon. Note that it has to be possible to subtract the interval specified from a date, so if prepending a
minus sign creates invalid date math, this will cause an error. Also note that a document that is itself
destined for a collection that does not exist will still trigger synchronous creation up to that destination collection
but will not trigger additional async preemptive creation. Only one type of collection creation can happen
per document.
Example: `90MINUTES`.
+
This property is blank by default indicating just-in-time, synchronous creation of new collections.
`router.autoDeleteAge`::
A date math expression that results in the oldest collections getting deleted automatically.
+
The date math is relative to the timestamp of a newly created collection (typically close to the current time),
and thus this must produce an earlier time via rounding and/or subtracting.
Collections to be deleted must have a time range that is entirely before the computed age.
Collections are considered for deletion immediately prior to new collections getting created.
Example: `/DAY-90DAYS`.
+
The default is not to delete.
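The interaction between synchronous and preemptive creation described under `router.preemptiveCreateMath` can be sketched as a small decision function. This is an illustration only, not Solr's implementation: it uses a Python `timedelta` in place of date math, assumes the end time of a collection is exclusive, and ignores `router.maxFutureMs` and the case where the next collection already exists.

```python
from datetime import datetime, timedelta, timezone


def creation_mode(doc_time, latest_end, preemptive_window=None):
    """Return which kind of collection creation a document would trigger."""
    if doc_time >= latest_end:
        # The document's own destination collection does not exist yet.
        return "synchronous"
    if preemptive_window is not None and doc_time > latest_end - preemptive_window:
        # Inside the window before the end of the most recent collection.
        return "preemptive"
    return "none"


# Most recent collection covers 2018-01-15, so it ends at 2018-01-16T00:00Z.
latest_end = datetime(2018, 1, 16, tzinfo=timezone.utc)
window = timedelta(minutes=90)  # stands in for `90MINUTES`

for stamp in (datetime(2018, 1, 15, 22, 0, tzinfo=timezone.utc),
              datetime(2018, 1, 15, 23, 0, tzinfo=timezone.utc),
              datetime(2018, 1, 16, 0, 30, tzinfo=timezone.utc)):
    print(stamp.isoformat(), creation_mode(stamp, latest_end, window))
```

Returning a single mode per document reflects the statement above that only one type of collection creation can happen per document.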
==== Category Routed Alias Parameters
`router.maxCardinality`::
The maximum number of categories allowed for this alias.
This setting safeguards against the inadvertent creation of an infinite number of collections in the event of bad data.
`router.mustMatch`::
A regular expression that the value of the field specified by `router.field` must match before a corresponding
collection will be created. Note that changing this setting after data has been added will not alter the data already
indexed. Any valid Java regular expression pattern may be specified. This expression is pre-compiled at the start of
each request, so batching of updates is strongly recommended. Overly complex patterns will produce CPU
or garbage collection overhead during indexing as determined by the JVM's implementation of regular expressions.
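
The interplay of `router.mustMatch` and `router.maxCardinality` can be pictured with a small client-side pre-check. This is only a sketch under stated assumptions: the function name is hypothetical, the pattern is an invented example, the whole value is assumed to have to match, and Python's `re` module differs from Java regular expressions in some details.

[source,python]
----
import re

def accepts_category(value, must_match, existing, max_cardinality):
    """Return True if a document with this category value could be routed."""
    if value in existing:
        return True  # a collection for this category already exists
    if must_match is not None and must_match.fullmatch(value) is None:
        return False  # value fails the mustMatch-style check
    return len(existing) < max_cardinality  # room for one more category

# Compile once and reuse for the whole batch rather than per document.
pattern = re.compile(r"[a-z]+_\d{4}")
existing = {"sales_2018", "sales_2019"}
batch = ["sales_2019", "ops_2019", "???", "hr_2019"]

accepted = []
for value in batch:
    if accepts_category(value, pattern, existing, max_cardinality=3):
        existing.add(value)
        accepted.append(value)
----

With a limit of three categories, the malformed value and the value that would create a fourth category are both turned away, which is the safeguard against bad data described above.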
=== CREATEALIAS Response
The output will simply be a responseHeader with details of the time it took to process the request.
To confirm the creation of the alias, you can look in the Solr Admin UI, under the Cloud section and find the
`aliases.json` file. The initial collection for routed aliases should also be visible in various parts of the admin UI.
=== Examples using CREATEALIAS
*Input*
Create an alias named "testalias" and link it to the collections named "anotherCollection" and "testCollection".
// tag::createalias-simple-example[]
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=testalias&collections=anotherCollection,testCollection&wt=xml
----
//end::createalias-simple-example[]
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">122</int>
</lst>
</response>
----
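
The same request can be assembled programmatically. The following Python sketch builds the URL shown above with the standard library; the helper name `create_alias_url` is hypothetical, and the host and port are taken from the example.

[source,python]
----
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"

def create_alias_url(name, collections, base=SOLR):
    """Build a CREATEALIAS request URL for a standard alias."""
    params = {
        "action": "CREATEALIAS",
        "name": name,
        "collections": ",".join(collections),
        "wt": "xml",
    }
    # safe="," keeps the comma-separated collection list unescaped
    return base + "/admin/collections?" + urlencode(params, safe=",")

url = create_alias_url("testalias", ["anotherCollection", "testCollection"])
print(url)
----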
*Input*
Create an alias named "myTimeData" for data beginning at the start of the current day in the UTC time zone (`router.start=NOW/DAY`) and partitioning daily
based on the `evt_dt` field in the incoming documents. Data more than one hour beyond the latest (most recent)
partition is to be rejected, and collections are created with two shards using a configset named "myConfig".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=myTimeData&router.start=NOW/DAY&router.field=evt_dt&router.name=time&router.interval=%2B1DAY&router.maxFutureMs=3600000&create-collection.collection.configName=myConfig&create-collection.numShards=2
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1234</int>
</lst>
</response>
----
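
Note that the `+` in `router.interval` is sent as `%2B` in the request above. The following Python sketch, using only the standard library, shows why: a literal `+` in a query string is decoded as a space. The parameter values are copied from the example.

[source,python]
----
from urllib.parse import urlencode, parse_qs, urlsplit

params = {
    "action": "CREATEALIAS",
    "name": "myTimeData",
    "router.start": "NOW/DAY",
    "router.field": "evt_dt",
    "router.name": "time",
    "router.interval": "+1DAY",
    "router.maxFutureMs": "3600000",
    "create-collection.collection.configName": "myConfig",
    "create-collection.numShards": "2",
}

# urlencode escapes "+" as %2B; safe="/" leaves the date math rounding readable.
query = urlencode(params, safe="/")
url = "http://localhost:8983/solr/admin/collections?" + query

# Round trip: the encoded form decodes back to "+1DAY" ...
decoded = parse_qs(urlsplit(url).query)
# ... whereas an unescaped "+" would be read as a leading space.
broken = parse_qs("router.interval=+1DAY")
----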
*Input*
A somewhat contrived example demonstrating the <<v2-api.adoc#top-v2-api,V2 API>> usage and additional collection creation options.
Notice that the collection creation parameters follow the v2 API naming convention, not the v1 naming conventions.
[source,json]
----
POST /api/c
{
"create-routed-alias" : {
"name": "somethingTemporalThisWayComes",
"router" : {
"name": "time",
"field": "evt_dt",
"start":"NOW/MINUTE",
"interval":"+2HOUR",
"maxFutureMs":"14400000"
},
"create-collection" : {
"config":"_default",
"router": {
"name":"implicit",
"field":"foo_s"
},
"shards":"foo,bar,baz",
"numShards": 3,
"tlogReplicas":1,
"pullReplicas":1,
"maxShardsPerNode":2,
"properties" : {
"foobar":"bazbam"
}
}
}
}
----
*Output*
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 1234
}
}
----
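
A V2 request of this shape could be prepared from a script as sketched below. The payload is a reduced version of the example above; the host and port are assumptions carried over from the other examples, and the request is only constructed, not sent, so the sketch runs without a Solr node.

[source,python]
----
import json
from urllib.request import Request

payload = {
    "create-routed-alias": {
        "name": "somethingTemporalThisWayComes",
        "router": {
            "name": "time",
            "field": "evt_dt",
            "start": "NOW/MINUTE",
            "interval": "+2HOUR",
            "maxFutureMs": "14400000",
        },
        "create-collection": {
            "config": "_default",
            "numShards": 3,
        },
    }
}

# No URL escaping is needed for "+2HOUR" here because it travels in the JSON body.
body = json.dumps(payload).encode("utf-8")
request = Request(
    "http://localhost:8983/api/c",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would submit it to a running cluster.
----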
[[listaliases]]
== LISTALIASES: List of all aliases in the cluster
`/admin/collections?action=LISTALIASES`
The LISTALIASES action does not take any parameters.
=== LISTALIASES Response
The output will contain a list of aliases with the corresponding collection names.
=== Examples using LISTALIASES
*Input*
List the existing aliases, requesting information as XML from Solr:
[source,text]
----
http://localhost:8983/solr/admin/collections?action=LISTALIASES&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="aliases">
<str name="testalias1">collection1</str>
<str name="testalias2">collection1,collection2</str>
</lst>
<lst name="properties">
<lst name="testalias1"/>
<lst name="testalias2">
<str name="someKey">someValue</str>
</lst>
</lst>
</response>
----
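
A response like the one above can be turned into a mapping from alias name to member collections. The sketch below parses the sample output with Python's standard XML parser; the function name `parse_aliases` is hypothetical.

[source,python]
----
import xml.etree.ElementTree as ET

SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="aliases">
    <str name="testalias1">collection1</str>
    <str name="testalias2">collection1,collection2</str>
  </lst>
  <lst name="properties">
    <lst name="testalias1"/>
    <lst name="testalias2">
      <str name="someKey">someValue</str>
    </lst>
  </lst>
</response>"""

def parse_aliases(xml_text):
    """Map each alias name to the list of collections it references."""
    root = ET.fromstring(xml_text)
    aliases = {}
    for entry in root.findall("./lst[@name='aliases']/str"):
        aliases[entry.get("name")] = entry.text.split(",")
    return aliases

aliases = parse_aliases(SAMPLE)
----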
[[aliasprop]]
== ALIASPROP: Modify Alias Properties for a Collection
The `ALIASPROP` action modifies the properties (metadata) on an alias. If a key is set with an empty value, it will be removed.
`/admin/collections?action=ALIASPROP&name=_name_&property.someKey=somevalue`
WARNING: This command allows you to revise any property. No alias-specific validation is performed.
Routed aliases may cease to function, function incorrectly, or cause errors if property values
are set carelessly.
=== ALIASPROP Parameters
`name`::
The alias name on which to set properties. This parameter is required.
`property.*`::
The name of the property to be modified replaces '*'; the value of the parameter is passed as the value for the property.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== ALIASPROP Response
The output will simply be a responseHeader with details of the time it took to process the request.
To confirm the creation of the property or properties, you can look in the Solr Admin UI, under the Cloud section and
find the `aliases.json` file or use the LISTALIASES API command.
=== Examples using ALIASPROP
*Input*
For an alias named "testalias2", set the value "someValue" for a property of "someKey" and "otherValue" for "otherKey".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ALIASPROP&name=testalias2&property.someKey=someValue&property.otherKey=otherValue&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">122</int>
</lst>
</response>
----
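
The `property.*` convention lends itself to a small helper. In this Python sketch the function name `aliasprop_url` is hypothetical; passing `None` for a property sends an empty value, which, per the description above, removes the key.

[source,python]
----
from urllib.parse import urlencode

def aliasprop_url(alias, properties, base="http://localhost:8983/solr"):
    """Build an ALIASPROP request URL from a dict of property names and values."""
    params = {"action": "ALIASPROP", "name": alias}
    for key, value in properties.items():
        # An empty value asks Solr to remove the property.
        params["property." + key] = "" if value is None else value
    return base + "/admin/collections?" + urlencode(params)

set_url = aliasprop_url("testalias2", {"someKey": "someValue", "otherKey": "otherValue"})
remove_url = aliasprop_url("testalias2", {"someKey": None})
----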
[[deletealias]]
== DELETEALIAS: Delete a Collection Alias
`/admin/collections?action=DELETEALIAS&name=_name_`
=== DELETEALIAS Parameters
`name`::
The name of the alias to delete. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== DELETEALIAS Response
The output will simply be a responseHeader with details of the time it took to process the request.
To confirm the removal of the alias, you can look in the Solr Admin UI, under the Cloud section, and
find the `aliases.json` file.
=== Examples using DELETEALIAS
*Input*
Remove the alias named "testalias".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=testalias&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">117</int>
</lst>
</response>
----

View File

@ -34,6 +34,6 @@ image::images/collections-core-admin/collection-admin.png[image,width=653,height
Replicas can be deleted by clicking the red "X" next to the replica name.
If the shard is inactive, for example after a <<collections-api.adoc#splitshard,SPLITSHARD action>>, an option to delete the shard will appear as a red "X" next to the shard name.
If the shard is inactive, for example after a <<shard-management.adoc#splitshard,SPLITSHARD action>>, an option to delete the shard will appear as a red "X" next to the shard name.
image::images/collections-core-admin/DeleteShard.png[image,width=486,height=250]

View File

@ -28,7 +28,7 @@ _linked_ to the same `withCollection`.
== Create a Colocated Collection
The Create Collection API supports a parameter named `withCollection` which can be used to specify a collection
with which the replicas of the newly created collection should be colocated. See <<collections-api.adoc#create,Create Collection API>>.
with which the replicas of the newly created collection should be colocated. See <<collection-management.adoc#create,Create Collection API>>.
`/admin/collections?action=CREATE&name=techproducts&numShards=1&replicationFactor=2&withCollection=tech_categories`
@ -36,7 +36,7 @@ In the above example, all replicas of the `techproducts` collection will be colo
replica of the `tech_categories` collection.
== Colocating Existing Collections
When collections already exist beforehand, the <<collections-api.adoc#modifycollection, Modify Collection API>> can be
When collections already exist beforehand, the <<collection-management.adoc#modifycollection, Modify Collection API>> can be
used to set the `withCollection` parameter so that the two collections can be linked. This will *not* trigger
changes to the cluster automatically because moving a large number of replicas immediately might de-stabilize the system.
Instead, it is recommended that the Suggestions UI page should be consulted on the operations that can be performed
@ -47,7 +47,7 @@ Example:
== Deleting Colocated Collections
Deleting a collection which has been linked to another will fail unless the link itself is deleted first by using the
<<collections-api.adoc#modifycollection, Modify Collection API>> to un-set the `withCollection` attribute.
<<collection-management.adoc#modifycollection, Modify Collection API>> to un-set the `withCollection` attribute.
Example:
`/admin/collections?action=MODIFYCOLLECTION&collection=techproducts&withCollection=`

View File

@ -148,7 +148,7 @@ This can be useful to create a chroot path in ZooKeeper before first cluster sta
This command will add or modify a single cluster property in `clusterprops.json`. Use this command instead of the usual getfile \-> edit \-> putfile cycle.
Unlike the CLUSTERPROP command on the <<collections-api.adoc#clusterprop,Collections API>>, this command does *not* require a running Solr cluster.
Unlike the CLUSTERPROP command on the <<cluster-node-management.adoc#clusterprop,Collections API>>, this command does *not* require a running Solr cluster.
[source,bash]
----

View File

@ -299,7 +299,7 @@ This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 10
The `targetCore` must already exist and must have a compatible schema with the `core` index. A commit is automatically called on the `core` index before it is split.
This command is used as part of the <<collections-api.adoc#splitshard,SPLITSHARD>> command but it can be used for non-cloud Solr cores as well. When used against a non-cloud core without `split.key` parameter, this action will split the source index and distribute its documents alternately so that each split piece contains an equal number of documents. If the `split.key` parameter is specified then only documents having the same route key will be split from the source index.
This command is used as part of the <<shard-management.adoc#splitshard,SPLITSHARD>> command but it can be used for non-cloud Solr cores as well. When used against a non-cloud core without `split.key` parameter, this action will split the source index and distribute its documents alternately so that each split piece contains an equal number of documents. If the `split.key` parameter is specified then only documents having the same route key will be split from the source index.
[[coreadmin-requeststatus]]
== REQUESTSTATUS

View File

@ -300,7 +300,7 @@ When upgrading to Solr 7.6, users should be aware of the following major changes
*Collections*
* The JSON parameter to set cluster-wide default cluster properties with the <<collections-api.adoc#clusterprop,CLUSTERPROP>> command has changed.
* The JSON parameter to set cluster-wide default cluster properties with the <<cluster-node-management.adoc#clusterprop,CLUSTERPROP>> command has changed.
+
The old syntax nested the defaults into a property named `clusterDefaults`. The new syntax uses only `defaults`. The command to use is still `set-obj-property`.
+
@ -429,7 +429,7 @@ When upgrading to Solr 7.3, users should be aware of the following major changes
+
This means to upgrade to Solr 8 in the future, you will need to be on Solr 7.3 or higher.
* Replicas which are not up-to-date are no longer allowed to become leader. Use the <<collections-api.adoc#forceleader,FORCELEADER command>> of the Collections API to allow these replicas become leader.
* Replicas which are not up-to-date are no longer allowed to become leader. Use the <<shard-management.adoc#forceleader,FORCELEADER command>> of the Collections API to allow these replicas become leader.
*Spatial*
@ -479,7 +479,7 @@ When upgrading to Solr 7.1, users should be aware of the following major changes
+
Existing users of this feature should not have to change anything. However, they should note these changes:
** Behavior: Changing the `autoAddReplicas` property from disabled (`false`) to enabled (`true`) using <<collections-api.adoc#modifycollection,MODIFYCOLLECTION API>> no longer replaces down replicas for the collection immediately. Instead, replicas are only added if a node containing them went down while `autoAddReplicas` was enabled. The parameters `autoReplicaFailoverBadNodeExpiration` and `autoReplicaFailoverWorkLoopDelay` are no longer used.
** Behavior: Changing the `autoAddReplicas` property from disabled (`false`) to enabled (`true`) using <<collection-management.adoc#modifycollection,MODIFYCOLLECTION API>> no longer replaces down replicas for the collection immediately. Instead, replicas are only added if a node containing them went down while `autoAddReplicas` was enabled. The parameters `autoReplicaFailoverBadNodeExpiration` and `autoReplicaFailoverWorkLoopDelay` are no longer used.
** Deprecations: Enabling/disabling autoAddReplicas cluster-wide with the API will be deprecated; use suspend/resume trigger APIs with `name=".auto_add_replicas"` instead.
+
More information about the changes to this feature can be found in the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,SolrCloud Automatically Adding Replicas>>.

View File

@ -28,8 +28,8 @@ NOTE: SolrCloud Backup/Restore requires a shared file system mounted at the same
Two commands are available:
* `action=BACKUP`: This command backs up Solr indexes and configurations. More information is available in the section <<collections-api.adoc#backup,Backup Collection>>.
* `action=RESTORE`: This command restores Solr indexes and configurations. More information is available in the section <<collections-api.adoc#restore,Restore Collection>>.
* `action=BACKUP`: This command backs up Solr indexes and configurations. More information is available in the section <<collection-management.adoc#backup,Backup Collection>>.
* `action=RESTORE`: This command restores Solr indexes and configurations. More information is available in the section <<collection-management.adoc#restore,Restore Collection>>.
== Standalone Mode Backups

View File

@ -186,7 +186,7 @@ Once the indexes have been cleared, you can start reindexing by re-running the o
=== Index to Another Collection
In cases where you cannot take a production collection offline to delete all the documents, one option is to use Solr's <<collections-api.adoc#createalias,collection alias>> feature.
In cases where you cannot take a production collection offline to delete all the documents, one option is to use Solr's <<collection-aliasing.adoc#createalias,collection alias>> feature.
This option is only available for Solr installations running in SolrCloud mode.
@ -199,7 +199,7 @@ Here is an example of creating an alias that points to a single collection:
[source,bash]
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=myData&collections=newCollection
Once the alias is in place and you are satisfied you no longer need the old data, you can delete the old collection with the <<collections-api.adoc#delete,DELETE command>> of the Collections API:
Once the alias is in place and you are satisfied you no longer need the old data, you can delete the old collection with the <<collection-management.adoc#delete,DELETE command>> of the Collections API:
[source,bash]
http://localhost:8983/solr/admin/collections?action=DELETE&name=oldCollection

View File

@ -0,0 +1,391 @@
= Replica Management Commands
:page-toclevels: 1
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
A replica is a physical copy of a shard.
[[addreplica]]
== ADDREPLICA: Add Replica
Add one or more replicas to a shard in a collection. The node name can be specified if the replica is to be created in a specific node. Otherwise, a set of nodes can be specified and the most suitable ones among them will be chosen to create the replica(s).
The API uses the Autoscaling framework to find nodes that can satisfy the disk requirements for the new replica(s), but only when Autoscaling preferences or a policy are configured. Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>> section for more details.
`/admin/collections?action=ADDREPLICA&collection=_collection_&shard=_shard_&node=_nodeName_`
=== ADDREPLICA Parameters
`collection`::
The name of the collection where the replica should be created. This parameter is required.
`shard`::
The name of the shard to which the replica is to be added.
+
If `shard` is not specified, then `\_route_` must be.
`\_route_`::
If the exact shard name is not known, users may pass the `\_route_` value and the system will identify the name of the shard.
+
Ignored if the `shard` parameter is also specified.
`node`::
The name of the node where the replica should be created (optional).
`createNodeSet`::
A comma-separated list of nodes among which the best ones will be chosen to place the replicas (optional).
+
The format is a comma-separated list of node_names, such as `localhost:8983_solr,localhost:8984_solr,localhost:8985_solr`.
NOTE: If neither `node` nor `createNodeSet` is specified, then the best node(s) from among all the live nodes in the cluster are chosen.
`instanceDir`::
The instanceDir for the core that will be created.
`dataDir`::
The directory in which the core should be created.
`type`::
The type of replica to create. These possible values are allowed:
+
* `nrt`: The NRT type maintains a transaction log and updates its index locally. This is the default and the most commonly used.
* `tlog`: The TLOG type maintains a transaction log but only updates its index via replication.
* `pull`: The PULL type does not maintain a transaction log and only updates its index via replication. This type is not eligible to become a leader.
+
See the section <<shards-and-indexing-data-in-solrcloud.adoc#types-of-replicas,Types of Replicas>> for more information about replica type options.
`nrtReplicas`::
The number of `nrt` replicas that should be created (optional, defaults to 1 if `type` is `nrt` otherwise 0).
`tlogReplicas`::
The number of `tlog` replicas that should be created (optional, defaults to 1 if `type` is `tlog` otherwise 0).
`pullReplicas`::
The number of `pull` replicas that should be created (optional, defaults to 1 if `type` is `pull` otherwise 0).
`property._name_=_value_`::
Set core property _name_ to _value_. See <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details about supported properties and values.
`waitForFinalState`::
If `true`, the request will complete only when all affected replicas become active. The default is `false`, which means that the API will return the status of the single action, which may be before the new replica is online and active.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== Examples using ADDREPLICA
*Input*
Create a replica for shard "shard2" of the "test2" collection on the node "192.167.1.2:8983_solr".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=test2&shard=shard2&node=192.167.1.2:8983_solr&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3764</int>
</lst>
<lst name="success">
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3450</int>
</lst>
<str name="core">test2_shard2_replica4</str>
</lst>
</lst>
</response>
----
*Input*
Add one PULL replica and one TLOG replica to shard "shard1" of the "gettingstarted" collection.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=addreplica&collection=gettingstarted&shard=shard1&tlogReplicas=1&pullReplicas=1
----
*Output*
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 784
},
"success": {
"127.0.1.1:7574_solr": {
"responseHeader": {
"status": 0,
"QTime": 257
},
"core": "gettingstarted_shard1_replica_p11"
},
"127.0.1.1:8983_solr": {
"responseHeader": {
"status": 0,
"QTime": 295
},
"core": "gettingstarted_shard1_replica_t10"
}
}
}
----
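
When several replicas are created at once, the `success` section holds one entry per node. The Python sketch below extracts the new core names from a response of this shape; the function name `created_cores` is hypothetical and the sample is the output shown above.

[source,python]
----
import json

SAMPLE = """{
  "responseHeader": {"status": 0, "QTime": 784},
  "success": {
    "127.0.1.1:7574_solr": {
      "responseHeader": {"status": 0, "QTime": 257},
      "core": "gettingstarted_shard1_replica_p11"
    },
    "127.0.1.1:8983_solr": {
      "responseHeader": {"status": 0, "QTime": 295},
      "core": "gettingstarted_shard1_replica_t10"
    }
  }
}"""

def created_cores(response_text):
    """Return a dict of node name to the core created on that node."""
    response = json.loads(response_text)
    status = response["responseHeader"]["status"]
    if status != 0:
        raise RuntimeError("ADDREPLICA reported status %s" % status)
    return {node: result["core"] for node, result in response.get("success", {}).items()}

cores = created_cores(SAMPLE)
----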
[[movereplica]]
== MOVEREPLICA: Move a Replica to a New Node
This command moves a replica from one node to a new node. In case of shared filesystems the `dataDir` will be reused.
The API uses the Autoscaling framework to find nodes that can satisfy the disk requirements for the replica to be moved, but only when an Autoscaling policy is configured. Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>> section for more details.
`/admin/collections?action=MOVEREPLICA&collection=collection&shard=shard&replica=replica&sourceNode=nodeName&targetNode=nodeName`
=== MOVEREPLICA Parameters
`collection`::
The name of the collection. This parameter is required.
`shard`::
The name of the shard that the replica belongs to. This parameter is required.
`replica`::
The name of the replica. This parameter is required.
`sourceNode`::
The name of the node that contains the replica. This parameter is required.
`targetNode`::
The name of the destination node. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
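
As a minimal sketch of a MOVEREPLICA request, the following Python snippet assembles the URL from the parameters listed above. The helper name and all collection, replica, and node names are hypothetical placeholders.

[source,python]
----
from urllib.parse import urlencode

def move_replica_url(collection, shard, replica, source_node, target_node,
                     base="http://localhost:8983/solr"):
    """Build a MOVEREPLICA request URL."""
    params = {
        "action": "MOVEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
        "sourceNode": source_node,
        "targetNode": target_node,
    }
    # safe=":" leaves the host:port part of node names unescaped for readability
    return base + "/admin/collections?" + urlencode(params, safe=":")

url = move_replica_url("test", "shard1", "core_node6",
                       "localhost:8983_solr", "localhost:8984_solr")
----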
[[deletereplica]]
== DELETEREPLICA: Delete a Replica
Deletes a named replica from the specified collection and shard.
If the corresponding core is up and running, the core is unloaded, the entry is removed from the clusterstate, and (by default) the instanceDir and dataDir are deleted. If the node/core is down, the entry is taken off the clusterstate, and if the core comes up later it is automatically unregistered.
`/admin/collections?action=DELETEREPLICA&collection=_collection_&shard=_shard_&replica=_replica_`
=== DELETEREPLICA Parameters
`collection`::
The name of the collection. This parameter is required.
`shard`::
The name of the shard that includes the replica to be removed. This parameter is required.
`replica`::
The name of the replica to remove.
+
If `count` is used instead, this parameter is not required. Otherwise, this parameter must be supplied.
`count`::
The number of replicas to remove. If the requested number exceeds the number of replicas, no replicas will be deleted. If there is only one replica, it will not be removed.
+
If `replica` is used instead, this parameter is not required. Otherwise, this parameter must be supplied.
`deleteInstanceDir`::
By default Solr will delete the entire instanceDir of the replica that is deleted. Set this to `false` to prevent the instance directory from being deleted.
`deleteDataDir`::
By default Solr will delete the dataDir of the replica that is deleted. Set this to `false` to prevent the data directory from being deleted.
`deleteIndex`::
By default Solr will delete the index of the replica that is deleted. Set this to `false` to prevent the index directory from being deleted.
`onlyIfDown`::
When set to `true`, no action will be taken if the replica is active. Default `false`.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== Examples using DELETEREPLICA
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=test2&shard=shard2&replica=core_node3&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">110</int>
</lst>
</response>
----
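
Because `replica` and `count` are alternatives, a client may want to check the parameters before sending the request. The following Python sketch does so; the function name is hypothetical, and rejecting a request only when both are missing is an interpretation of the parameter descriptions above.

[source,python]
----
def delete_replica_params(collection, shard, replica=None, count=None, **flags):
    """Assemble DELETEREPLICA parameters, requiring `replica` or `count`."""
    if replica is None and count is None:
        raise ValueError("either 'replica' or 'count' must be supplied")
    params = {"action": "DELETEREPLICA", "collection": collection, "shard": shard}
    if replica is not None:
        params["replica"] = replica
    if count is not None:
        params["count"] = str(count)
    # Boolean options such as deleteIndex or onlyIfDown are sent as true/false.
    for name, value in flags.items():
        params[name] = "true" if value else "false"
    return params

by_name = delete_replica_params("test2", "shard2", replica="core_node3")
by_count = delete_replica_params("test2", "shard2", count=1, deleteIndex=False)
----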
[[addreplicaprop]]
== ADDREPLICAPROP: Add Replica Property
Assign an arbitrary property to a particular replica and give it the value specified. If the property already exists, it will be overwritten with the new value.
`/admin/collections?action=ADDREPLICAPROP&collection=collectionName&shard=shardName&replica=replicaName&property=propertyName&property.value=value`
=== ADDREPLICAPROP Parameters
`collection`::
The name of the collection the replica belongs to. This parameter is required.
`shard`::
The name of the shard the replica belongs to. This parameter is required.
`replica`::
The replica, e.g., `core_node1`. This parameter is required.
`property`::
The name of the property to add. This parameter is required.
+
This will have the literal `property.` prepended to distinguish it from system-maintained properties. So these two forms are equivalent:
+
`property=special`
+
and
+
`property=property.special`
`property.value`::
The value to assign to the property. This parameter is required.
`shardUnique`::
If `true`, then setting this property in one replica will remove the property from all other replicas in that shard. The default is `false`.
+
There is one pre-defined property `preferredLeader` for which `shardUnique` is forced to `true` and an error is returned if `shardUnique` is explicitly set to `false`.
+
`preferredLeader` is a boolean property. Any value assigned that is not equal (case insensitive) to `true` will be interpreted as `false` for `preferredLeader`.
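
The naming and boolean rules above can be summarized in a few lines of Python. This is an illustrative model, not Solr code, and both function names are hypothetical.

[source,python]
----
PREFIX = "property."

def normalize_property(name):
    """Prepend `property.` unless the caller already supplied it."""
    return name if name.startswith(PREFIX) else PREFIX + name

def preferred_leader_value(raw):
    """Only a case-insensitive "true" counts as true; anything else is false."""
    return raw.lower() == "true"

print(normalize_property("special"))           # property.special
print(normalize_property("property.special"))  # property.special
print(preferred_leader_value("TRUE"), preferred_leader_value("yes"))  # True False
----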
=== ADDREPLICAPROP Response
The response will include the status of the request. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using ADDREPLICAPROP
*Input*
This command would set the "preferredLeader" property (`property.preferredLeader`) to "true" on "core_node1", and remove that property from any other replica in the shard.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node1&property=preferredLeader&property.value=true&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">46</int>
</lst>
</response>
----
*Input*
This pair of commands will set the "testprop" property (`property.testprop`) to 'value1' and 'value2' respectively for two replicas in the same shard.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node1&property=testprop&property.value=value1
http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node3&property=property.testprop&property.value=value2
----
*Input*
This pair of commands would result in only "core_node3" having the "testprop" property (`property.testprop`) value set because the second command specifies `shardUnique=true`, which would cause the property to be removed from "core_node1".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node1&property=testprop&property.value=value1
http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node3&property=testprop&property.value=value2&shardUnique=true
----
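
The effect of `shardUnique` can be modelled with plain dictionaries. The sketch below is a simplified simulation of the two commands above, not Solr's implementation; the function name `set_prop` is hypothetical and the `property.` prefix is left out for brevity.

[source,python]
----
def set_prop(replicas, replica, name, value, shard_unique=False):
    """Set a property on one replica of a shard.

    With shard_unique=True the property is first removed from every
    replica in the shard, so only the target replica ends up holding it.
    """
    if shard_unique:
        for props in replicas.values():
            props.pop(name, None)
    replicas[replica][name] = value

# Replicas of shard1, each with its own property map.
shard1 = {"core_node1": {}, "core_node3": {}}
set_prop(shard1, "core_node1", "testprop", "value1")
set_prop(shard1, "core_node3", "testprop", "value2", shard_unique=True)
----

After both calls, only `core_node3` carries `testprop`, which mirrors the outcome described for the pair of requests.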
[[deletereplicaprop]]
== DELETEREPLICAPROP: Delete Replica Property
Deletes an arbitrary property from a particular replica.
`/admin/collections?action=DELETEREPLICAPROP&collection=_collectionName_&shard=_shardName_&replica=_replicaName_&property=_propertyName_`
=== DELETEREPLICAPROP Parameters
`collection`::
The name of the collection the replica belongs to. This parameter is required.
`shard`::
The name of the shard the replica belongs to. This parameter is required.
`replica`::
The replica, e.g., `core_node1`. This parameter is required.
`property`::
The property to delete. This will have the literal `property.` prepended to distinguish it from system-maintained properties. So these two forms are equivalent:
+
`property=special`
+
and
+
`property=property.special`
=== DELETEREPLICAPROP Response
The response will include the status of the request. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using DELETEREPLICAPROP
*Input*
This command would delete the "preferredLeader" property (`property.preferredLeader`) from "core_node1".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=DELETEREPLICAPROP&shard=shard1&collection=collection1&replica=core_node1&property=preferredLeader&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">9</int>
</lst>
</response>
----

View File

@ -69,7 +69,7 @@ The nodes are sorted first and the rules are used to sort them. This ensures tha
== Rules for New Shards
The rules are persisted along with collection state. So, when a new replica is created, the system will assign replicas satisfying the rules. When a new shard is created as a result of using the Collection API's <<collections-api.adoc#createshard,CREATESHARD command>>, ensure that you have created rules specific for that shard name. Rules can be altered using the <<collections-api.adoc#modifycollection,MODIFYCOLLECTION command>>. However, it is not required to do so if the rules do not specify explicit shard names. For example, a rule such as `shard:shard1,replica:*,ip_3:168:`, will not apply to any new shard created. But, if your rule is `replica:*,ip_3:168`, then it will apply to any new shard created.
The rules are persisted along with collection state. So, when a new replica is created, the system will assign replicas satisfying the rules. When a new shard is created as a result of using the Collection API's <<shard-management.adoc#createshard,CREATESHARD command>>, ensure that you have created rules specific for that shard name. Rules can be altered using the <<collection-management.adoc#modifycollection,MODIFYCOLLECTION command>>. However, it is not required to do so if the rules do not specify explicit shard names. For example, a rule such as `shard:shard1,replica:*,ip_3:168:`, will not apply to any new shard created. But, if your rule is `replica:*,ip_3:168`, then it will apply to any new shard created.
The same is applicable to shard splitting. Shard splitting is treated exactly the same way as shard creation. Even though `shard1_1` and `shard1_2` may be created from `shard1`, the rules treat them as distinct, unrelated shards.
@ -174,4 +174,4 @@ Rules are specified per collection during collection creation as request paramet
snitch=class:EC2Snitch&rule=shard:*,replica:1,dc:dc1&rule=shard:*,replica:<2,dc:dc3
----
These rules are persisted in `clusterstate.json` in ZooKeeper and are available throughout the lifetime of the collection. This enables the system to perform any future node allocation without direct user interaction. The rules added during collection creation can be modified later using the <<collections-api.adoc#modifycollection,MODIFYCOLLECTION>> API.
These rules are persisted in `clusterstate.json` in ZooKeeper and are available throughout the lifetime of the collection. This enables the system to perform any future node allocation without direct user interaction. The rules added during collection creation can be modified later using the <<collection-management.adoc#modifycollection,MODIFYCOLLECTION>> API.

@ -0,0 +1,321 @@
= Shard Management Commands
:page-toclevels: 1
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
In SolrCloud, a shard is a logical partition of a collection. This partition stores part of the entire index for a collection.
The number of shards you have helps to determine how many documents a single collection can contain in total, and also impacts search performance.
[[splitshard]]
== SPLITSHARD: Split a Shard
`/admin/collections?action=SPLITSHARD&collection=_name_&shard=_shardID_`
Splitting a shard will take an existing shard and break it into two pieces which are written to disk as two (new) shards. The original shard will continue to contain the same data as-is but it will start re-routing requests to the new shards. The new shards will have as many replicas as the original shard. A soft commit is automatically issued after splitting a shard so that documents are made visible on sub-shards. An explicit commit (hard or soft) is not necessary after a split operation because the index is automatically persisted to disk during the split operation.
This command allows for seamless splitting and requires no downtime. A shard being split will continue to accept query and indexing requests and will automatically start routing requests to the new shards once this operation is complete. This command can only be used for SolrCloud collections created with the `numShards` parameter, meaning collections which rely on Solr's hash-based routing mechanism.
The split is performed by dividing the original shard's hash range into two equal partitions and dividing up the documents in the original shard according to the new sub-ranges. Two parameters discussed below, `ranges` and `split.key`, provide further control over how the split occurs.
The newly created shards will have as many replicas as the parent shard, of the same replica types.
When using `splitMethod=rewrite` (the default), you must ensure that the node running the leader of the parent shard has enough free disk space, i.e., more than twice the index size, for the split to succeed. The API uses the Autoscaling framework to find nodes that can satisfy the disk requirements for the new replicas, but only when an Autoscaling policy is configured. Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>> section for more details.
Also, the first replicas of resulting sub-shards will always be placed on the shard leader node, which may cause Autoscaling policy violations that need to be resolved either automatically (when appropriate triggers are in use) or manually.
Shard splitting can be a long running process. In order to avoid timeouts, you should run this as an <<collections-api.adoc#asynchronous-calls,asynchronous call>>.
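As a rough illustration of the asynchronous approach, the following Python sketch only builds the request URLs; it does not contact Solr. The helper names `splitshard_url` and `requeststatus_url`, the request ID `split-1`, and the `localhost:8983` base URL are assumptions made for this example. The REQUESTSTATUS action and its `requestid` parameter are not described on this page; they belong to the asynchronous-calls section of the Collections API. The checks mirror the parameter rules listed under SPLITSHARD Parameters.

[source,python]
----
from urllib.parse import urlencode

# Assumption: a local Solr node, as in the examples on this page.
BASE_URL = "http://localhost:8983/solr/admin/collections"


def splitshard_url(collection, shard=None, split_key=None, ranges=None,
                   num_sub_shards=None, async_id=None):
    """Build a SPLITSHARD URL (hypothetical helper, not part of Solr)."""
    if shard is None and split_key is None:
        raise ValueError("shard is required when split.key is not specified")
    if num_sub_shards is not None:
        if ranges is not None or split_key is not None:
            raise ValueError("numSubShards cannot be combined with ranges or split.key")
        if not 2 <= num_sub_shards <= 8:
            raise ValueError("numSubShards must be between 2 and 8")
    params = {"action": "SPLITSHARD", "collection": collection}
    optional = {
        "shard": shard,
        "split.key": split_key,
        "ranges": ranges,
        "numSubShards": num_sub_shards,
        "async": async_id,
    }
    # Only send the parameters that were actually supplied.
    params.update({k: v for k, v in optional.items() if v is not None})
    return BASE_URL + "?" + urlencode(params)


def requeststatus_url(request_id):
    """Build the URL used to poll the status of an asynchronous request."""
    return BASE_URL + "?" + urlencode({"action": "REQUESTSTATUS", "requestid": request_id})


print(splitshard_url("anotherCollection", shard="shard1", async_id="split-1"))
print(requeststatus_url("split-1"))
----

A client would typically issue the first URL once and then poll the second until the request reports completion.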
=== SPLITSHARD Parameters
`collection`::
The name of the collection that includes the shard to be split. This parameter is required.
`shard`::
The name of the shard to be split. This parameter is required when `split.key` is not specified.
`ranges`::
A comma-separated list of hash ranges in hexadecimal, such as `ranges=0-1f4,1f5-3e8,3e9-5dc`.
+
This parameter can be used to divide the original shard's hash range into arbitrary hash range intervals specified in hexadecimal. For example, if the original hash range is `0-1500` then adding the parameter: `ranges=0-1f4,1f5-3e8,3e9-5dc` will divide the original shard into three shards with hash range `0-500`, `501-1000`, and `1001-1500` respectively.
`split.key`::
The key to use for splitting the index.
+
This parameter can be used to split a shard using a route key such that all documents of the specified route key end up in a single dedicated sub-shard. Providing the `shard` parameter is not required in this case because the route key is enough to figure out the right shard. A route key which spans more than one shard is not supported.
+
For example, suppose `split.key=A!` hashes to the range `12-15` and belongs to shard 'shard1' with range `0-20`. Splitting by this route key would yield three sub-shards with ranges `0-11`, `12-15` and `16-20`. Note that the sub-shard with the hash range of the route key may also contain documents for other route keys whose hash ranges overlap.
`numSubShards`::
The number of sub-shards to split the parent shard into. Allowed values are in the range `2`-`8`; the default is `2`.
+
This parameter can only be used when neither `ranges` nor `split.key` is specified.
`splitMethod`::
Currently two methods of shard splitting are supported:
* `splitMethod=rewrite` (default) after selecting documents to retain in each partition this method creates sub-indexes from
scratch, which is a lengthy CPU- and I/O-intensive process but results in optimally-sized sub-indexes that don't contain
any data from documents not belonging to each partition.
* `splitMethod=link` uses file system-level hard links for creating copies of the original index files and then only modifies the
file that contains the list of deleted documents in each partition. This method is many times quicker and lighter on resources than the
`rewrite` method but the resulting sub-indexes are still as large as the original index because they still contain data from documents not
belonging to the partition. This slows down the replication process and consumes more disk space on replica nodes (the multiple hard-linked
copies don't occupy additional disk space on the leader node, unless hard-linking is not supported).
`splitFuzz`::
A float value (the default is `0.0f`; it must be smaller than `0.5f`) that allows the sub-shard ranges to vary
by this percentage of the total shard range, odd shards being larger and even shards being smaller.
`property._name_=_value_`::
Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.
`waitForFinalState`::
If `true`, the request will complete only when all affected replicas become active. The default is `false`, which means that the API will return the status of the single action, which may be before the new replica is online and active.
`timing`::
If `true`, each stage of processing will be timed and a `timing` section will be included in the response.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
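The hash-range arithmetic behind the `ranges` and `split.key` examples above can be checked with a short Python sketch. The function names are hypothetical and this is not how Solr implements splitting; it only reproduces the numbers used in the examples. It also assumes small non-negative values, as in those examples, and does not model the signed 32-bit hash ranges of a real collection.

[source,python]
----
def parse_ranges(ranges):
    """Convert a `ranges` value such as '0-1f4,1f5-3e8' into decimal (start, end) pairs."""
    pairs = []
    for part in ranges.split(","):
        start_hex, end_hex = part.split("-")
        pairs.append((int(start_hex, 16), int(end_hex, 16)))
    return pairs


def split_around_key(shard_range, key_range):
    """Return the sub-ranges left when a route key's range is carved out of a shard's range."""
    shard_start, shard_end = shard_range
    key_start, key_end = key_range
    if not (shard_start <= key_start <= key_end <= shard_end):
        raise ValueError("the route key range must lie within a single shard")
    sub_ranges = []
    if key_start > shard_start:
        sub_ranges.append((shard_start, key_start - 1))  # hashes below the key
    sub_ranges.append((key_start, key_end))              # the dedicated sub-shard
    if key_end < shard_end:
        sub_ranges.append((key_end + 1, shard_end))      # hashes above the key
    return sub_ranges


print(parse_ranges("0-1f4,1f5-3e8,3e9-5dc"))  # [(0, 500), (501, 1000), (1001, 1500)]
print(split_around_key((0, 20), (12, 15)))    # [(0, 11), (12, 15), (16, 20)]
----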
=== SPLITSHARD Response
The output will include the status of the request and the new shard names, which will use the original shard as their basis, adding an underscore and a number. For example, "shard1" will become "shard1_0" and "shard1_1". If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using SPLITSHARD
*Input*
Split shard1 of the "anotherCollection" collection.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=anotherCollection&shard=shard1&wt=xml
----
*Output*
[source,xml]
----
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6120</int>
  </lst>
  <lst name="success">
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">3673</int>
      </lst>
      <str name="core">anotherCollection_shard1_1_replica1</str>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">3681</int>
      </lst>
      <str name="core">anotherCollection_shard1_0_replica1</str>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">6008</int>
      </lst>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">6007</int>
      </lst>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">71</int>
      </lst>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">0</int>
      </lst>
      <str name="core">anotherCollection_shard1_1_replica1</str>
      <str name="status">EMPTY_BUFFER</str>
    </lst>
    <lst>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">0</int>
      </lst>
      <str name="core">anotherCollection_shard1_0_replica1</str>
      <str name="status">EMPTY_BUFFER</str>
    </lst>
  </lst>
</response>
----
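A client may want to pull the overall status and the new core names out of a response like the one above. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the `summarize` helper is hypothetical and the embedded sample is an abbreviated copy of the output shown above, not a complete response.

[source,python]
----
import xml.etree.ElementTree as ET

# Abbreviated copy of the SPLITSHARD output above (assumption: wt=xml was requested).
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6120</int>
  </lst>
  <lst name="success">
    <lst>
      <lst name="responseHeader"><int name="status">0</int></lst>
      <str name="core">anotherCollection_shard1_1_replica1</str>
    </lst>
    <lst>
      <lst name="responseHeader"><int name="status">0</int></lst>
      <str name="core">anotherCollection_shard1_0_replica1</str>
    </lst>
  </lst>
</response>"""


def summarize(xml_text):
    """Return (top-level status, sorted unique core names) from a response document."""
    root = ET.fromstring(xml_text)
    status = int(root.find("./lst[@name='responseHeader']/int[@name='status']").text)
    cores = {el.text for el in root.findall("./lst[@name='success']/lst/str[@name='core']")}
    return status, sorted(cores)


status, cores = summarize(SAMPLE)
print(status, cores)
----

Core names are collected into a set because the full response lists each new core more than once.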
[[createshard]]
== CREATESHARD: Create a Shard
Shards can only be created with this API for collections that use the 'implicit' router (i.e., when the collection was created, `router.name=implicit`). A new shard with a name can be created for an existing 'implicit' collection.
Use SPLITSHARD for collections created with the 'compositeId' router (`router.name=compositeId`).
`/admin/collections?action=CREATESHARD&shard=_shardName_&collection=_name_`
The collection's default values for `replicationFactor` or `nrtReplicas`, `tlogReplicas`, and `pullReplicas` are used to determine the number of replicas to be created for the new shard. This can be customized by explicitly passing the corresponding parameters to the request.
The API uses the Autoscaling framework to find the best possible nodes in the cluster when Autoscaling preferences or a policy are configured. Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>> section for more details.
=== CREATESHARD Parameters
`collection`::
The name of the collection in which the new shard will be created. This parameter is required.
`shard`::
The name of the shard to be created. This parameter is required.
`createNodeSet`::
Allows defining the nodes to spread the new shard across. If not provided, the CREATESHARD operation will create shard replicas spread across all live Solr nodes.
+
The format is a comma-separated list of node_names, such as `localhost:8983_solr,localhost:8984_solr,localhost:8985_solr`.
`nrtReplicas`::
The number of `nrt` replicas that should be created for the new shard (optional; the collection's default is used if omitted).
`tlogReplicas`::
The number of `tlog` replicas that should be created for the new shard (optional; the collection's default is used if omitted).
`pullReplicas`::
The number of `pull` replicas that should be created for the new shard (optional; the collection's default is used if omitted).
`property._name_=_value_`::
Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.
`waitForFinalState`::
If `true`, the request will complete only when all affected replicas become active. The default is `false`, which means that the API will return the status of the single action, which may be before the new replica is online and active.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
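To show how the optional parameters combine, here is a small Python sketch that assembles a CREATESHARD URL with an explicit node set and replica counts. The `createshard_url` helper and the `localhost` node names are assumptions for illustration; the sketch builds a string and does not send a request.

[source,python]
----
from urllib.parse import urlencode

# Assumption: a local Solr node, as in the examples on this page.
BASE_URL = "http://localhost:8983/solr/admin/collections"


def createshard_url(collection, shard, nodes=None, nrt=None, tlog=None, pull=None):
    """Build a CREATESHARD URL (hypothetical helper, not part of Solr)."""
    params = {"action": "CREATESHARD", "collection": collection, "shard": shard}
    if nodes:
        # createNodeSet is a comma-separated list of node names.
        params["createNodeSet"] = ",".join(nodes)
    for name, value in (("nrtReplicas", nrt), ("tlogReplicas", tlog), ("pullReplicas", pull)):
        if value is not None:  # omitted counts fall back to the collection's defaults
            params[name] = value
    # Keep ',' and ':' unescaped so the node list stays readable.
    return BASE_URL + "?" + urlencode(params, safe=",:")


print(createshard_url(
    "anImplicitCollection",
    "shard-z",
    nodes=["localhost:8983_solr", "localhost:8984_solr"],
    nrt=2,
))
----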
=== CREATESHARD Response
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using CREATESHARD
*Input*
Create 'shard-z' for the "anImplicitCollection" collection.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CREATESHARD&collection=anImplicitCollection&shard=shard-z&wt=xml
----
*Output*
[source,xml]
----
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">558</int>
  </lst>
</response>
----
[[deleteshard]]
== DELETESHARD: Delete a Shard
Deleting a shard will unload all replicas of the shard, remove them from `clusterstate.json`, and (by default) delete the instanceDir and dataDir for each replica. It will only remove shards that are inactive, or which have no range given for custom sharding.
`/admin/collections?action=DELETESHARD&shard=_shardID_&collection=_name_`
=== DELETESHARD Parameters
`collection`::
The name of the collection that includes the shard to be deleted. This parameter is required.
`shard`::
The name of the shard to be deleted. This parameter is required.
`deleteInstanceDir`::
By default Solr will delete the entire instanceDir of each replica that is deleted. Set this to `false` to prevent the instance directory from being deleted.
`deleteDataDir`::
By default Solr will delete the dataDir of each replica that is deleted. Set this to `false` to prevent the data directory from being deleted.
`deleteIndex`::
By default Solr will delete the index of each replica that is deleted. Set this to `false` to prevent the index directory from being deleted.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
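Because all three delete flags default to `true`, a request only needs to mention the ones being turned off. The Python sketch below, with a hypothetical `deleteshard_url` helper, builds such a URL. It writes the value as the lowercase string `false` rather than relying on Python's `str(False)`, which would produce `False`.

[source,python]
----
from urllib.parse import urlencode

# Assumption: a local Solr node, as in the examples on this page.
BASE_URL = "http://localhost:8983/solr/admin/collections"


def deleteshard_url(collection, shard, delete_instance_dir=True,
                    delete_data_dir=True, delete_index=True):
    """Build a DELETESHARD URL (hypothetical helper, not part of Solr)."""
    params = {"action": "DELETESHARD", "collection": collection, "shard": shard}
    flags = {
        "deleteInstanceDir": delete_instance_dir,
        "deleteDataDir": delete_data_dir,
        "deleteIndex": delete_index,
    }
    for name, keep_default in flags.items():
        if not keep_default:
            params[name] = "false"  # only send flags that differ from the default
    return BASE_URL + "?" + urlencode(params)


# Delete shard1 but keep its data directory on disk.
print(deleteshard_url("anotherCollection", "shard1", delete_data_dir=False))
----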
=== DELETESHARD Response
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using DELETESHARD
*Input*
Delete 'shard1' of the "anotherCollection" collection.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=anotherCollection&shard=shard1&wt=xml
----
*Output*
[source,xml]
----
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">558</int>
  </lst>
  <lst name="success">
    <lst name="10.0.1.4:8983_solr">
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">27</int>
      </lst>
    </lst>
  </lst>
</response>
----
[[forceleader]]
== FORCELEADER: Force Shard Leader
In the unlikely event of a shard losing its leader, this command can be invoked to force the election of a new leader.
`/admin/collections?action=FORCELEADER&collection=<collectionName>&shard=<shardName>`
=== FORCELEADER Parameters
`collection`::
The name of the collection. This parameter is required.
`shard`::
The name of the shard where leader election should occur. This parameter is required.
WARNING: This is an expert level command, and should be invoked only when regular leader election is not working. This may potentially lead to loss of data in the event that the new leader doesn't have certain updates, possibly recent ones, which were acknowledged by the old leader before going down.

@ -92,7 +92,7 @@ By default all replicas serve queries. See the section <<distributed-requests.ad
== Document Routing
Solr offers the ability to specify the router implementation used by a collection by specifying the `router.name` parameter when <<collections-api.adoc#create,creating your collection>>.
Solr offers the ability to specify the router implementation used by a collection by specifying the `router.name` parameter when <<collection-management.adoc#create,creating your collection>>.
If you use the `compositeId` router (the default), you can send documents with a prefix in the document ID which will be used to calculate the hash Solr uses to determine the shard a document is sent to for indexing. The prefix can be anything you'd like it to be (it doesn't have to be the shard name, for example), but it must be consistent so Solr behaves consistently.
@ -116,7 +116,7 @@ When you create a collection in SolrCloud, you decide on the initial number shar
The ability to split shards is in the Collections API. It currently allows splitting a shard into two pieces. The existing shard is left as-is, so the split action effectively makes two copies of the data as new shards. You can delete the old shard at a later time when you're ready.
More details on how to use shard splitting is in the section on the Collection API's <<collections-api.adoc#splitshard,SPLITSHARD command>>.
More details on how to use shard splitting is in the section on the Collection API's <<shard-management.adoc#splitshard,SPLITSHARD command>>.
== Ignoring Commits from Client Applications in SolrCloud

@ -686,7 +686,7 @@ bin/solr zk upconfig -z 111.222.333.444:2181 -n mynewconfig -d /path/to/configse
.Reload Collections When Changing Configurations
[WARNING]
====
This command does *not* automatically make changes effective! It simply uploads the configuration sets to ZooKeeper. You can use the Collection API's <<collections-api.adoc#reload,RELOAD command>> to reload any collections that uses this configuration set.
This command does *not* automatically make changes effective! It simply uploads the configuration sets to ZooKeeper. You can use the Collection API's <<collection-management.adoc#reload,RELOAD command>> to reload any collections that uses this configuration set.
====
=== Download a Configuration Set

@ -61,11 +61,11 @@ For more information about the new parameter, see the section <<format-of-solr-x
* The CREATE command will now return the appropriate status code (4xx, 5xx, etc.) when the command has failed. Previously, it always returned `0`, even in failure.
* The MODIFYCOLLECTION command now accepts an attribute to set a collection as read-only. This can be used to block a collection from receiving any updates while still allowing queries to be served. See the section <<collections-api.adoc#modifycollection,MODIFYCOLLECTION>> for details on how to use it.
* The MODIFYCOLLECTION command now accepts an attribute to set a collection as read-only. This can be used to block a collection from receiving any updates while still allowing queries to be served. See the section <<collection-management.adoc#modifycollection,MODIFYCOLLECTION>> for details on how to use it.
* A new command RENAME allows renaming a collection by setting up a one-to-one alias using the new name. For more information, see the section <<collections-api.adoc#rename,RENAME>>.
* A new command RENAME allows renaming a collection by setting up a one-to-one alias using the new name. For more information, see the section <<collection-management.adoc#rename,RENAME>>.
* A new command REINDEXCOLLECTION allows indexing existing stored fields from a source collection into a new collection. For more information, please see the section <<collections-api.adoc#reindexcollection,REINDEXCOLLECTION>>.
* A new command REINDEXCOLLECTION allows indexing existing stored fields from a source collection into a new collection. For more information, please see the section <<collection-management.adoc#reindexcollection,REINDEXCOLLECTION>>.
*Logging*