Ref Guide: copy edit new autoscaling pages; fixes in lots of other pages for "eg" and "ie" misspellings

This commit is contained in:
Cassandra Targett 2017-10-26 13:47:12 -05:00
parent 3be50df7b6
commit 79baebb8b9
37 changed files with 208 additions and 212 deletions

View File

@ -53,7 +53,7 @@ When running the various examples mentioned through out this tutorial (i.e., `bi
Special notes are included throughout these pages. There are several types of notes:
=== Information blocks
=== Information Blocks
NOTE: These provide additional information that's useful for you to know.

View File

@ -37,7 +37,7 @@ A `TypeTokenFilterFactory` is available that creates a `TypeTokenFilter` that fi
For a complete list of the available TokenFilters, see the section <<tokenizers.adoc#tokenizers,Tokenizers>>.
== When To use a CharFilter vs. a TokenFilter
== When to Use a CharFilter vs. a TokenFilter
There are several pairs of CharFilters and TokenFilters that have related (i.e., `MappingCharFilter` and `ASCIIFoldingFilter`) or nearly identical (i.e., `PatternReplaceCharFilterFactory` and `PatternReplaceFilterFactory`) functionality and it may not always be obvious which is the best choice.

View File

@ -1545,7 +1545,7 @@ The name of the collection the replica belongs to. This parameter is required.
The name of the shard the replica belongs to. This parameter is required.
`replica`::
The replica, e.g. `core_node1`. This parameter is required.
The replica, e.g., `core_node1`. This parameter is required.
`property`::
The property to add. This will have the literal `property.` prepended to distinguish it from system-maintained properties. So these two forms are equivalent:

View File

@ -113,7 +113,7 @@ If you are on Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in the
.Bootstrap with chroot
[NOTE]
====
Using the boostrap command with a zookeeper chroot in the `-zkhost` parameter, e.g. `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
Using the bootstrap command with a ZooKeeper chroot in the `-zkhost` parameter, e.g., `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
====
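For example, a bootstrap invocation with a chroot might look like the following (the `-solrhome` path is illustrative):

[source,bash]
----
server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181/solr -cmd bootstrap -solrhome /var/solr/data
----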
=== Put Arbitrary Data into a New ZooKeeper file

View File

@ -39,14 +39,14 @@ All configuration items, can be retrieved by sending a GET request to the `/conf
curl http://localhost:8983/solr/techproducts/config
----
To restrict the returned results to a top level section, e.g. `query`, `requestHandler` or `updateHandler`, append the name of the section to the `/config` endpoint following a slash. E.g. to retrieve configuration for all request handlers:
To restrict the returned results to a top-level section, e.g., `query`, `requestHandler` or `updateHandler`, append the name of the section to the `/config` endpoint following a slash. For example, to retrieve configuration for all request handlers:
[source,bash]
----
curl http://localhost:8983/solr/techproducts/config/requestHandler
----
To further restrict returned results to a single component within a top level section, use the `componentName` request param, e.g. to return configuration for the `/select` request handler:
To further restrict returned results to a single component within a top level section, use the `componentName` request param, e.g., to return configuration for the `/select` request handler:
[source,bash]
----
@ -509,7 +509,7 @@ Directly editing any files without 'touching' the directory *will not* make it v
It is possible for components to watch for the configset 'touch' events by registering a listener using `SolrCore#registerConfListener()`.
=== Listening to config Changes
=== Listening to Config Changes
Any component can register a listener using:

View File

@ -58,7 +58,7 @@ Log levels settings are as follows:
Multiple settings at one time are allowed.
=== Log level API
=== Log Level API
There is also a way of sending REST commands to the logging endpoint to do the same. Example:

View File

@ -47,7 +47,7 @@ Both the source and the destination of `copyField` can contain either leading or
The `copyField` command can use a wildcard (*) character in the `dest` parameter only if the `source` parameter contains one as well. `copyField` uses the matching glob from the source field for the `dest` field name into which the source content is copied.
====
Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained i.e. _you cannot_ copy from `here` to `there` and then from `there` to `elsewhere`. However, the same source field can be copied to multiple destination fields:
Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained, i.e., _you cannot_ copy from `here` to `there` and then from `there` to `elsewhere`. However, the same source field can be copied to multiple destination fields:
[source,xml]
----

View File

@ -290,14 +290,14 @@ The `core` index will be split into two pieces and written into the two director
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&split.key=A!
Here all documents having the same route key as the `split.key` i.e. 'A!' will be split from the `core` index and written to the `targetCore`.
Here all documents having the same route key as the `split.key`, i.e., 'A!', will be split from the `core` index and written to the `targetCore`.
==== Usage with ranges parameter:
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2&targetCore=core3&ranges=0-1f4,1f5-3e8,3e9-5dc
This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 1001-1500 specified in hexadecimal. Here the index will be split into three pieces with each targetCore receiving documents matching the hash ranges specified i.e. core1 will get documents with hash range 0-500, core2 will receive documents with hash range 501-1000 and finally, core3 will receive documents with hash range 1001-1500. At least one hash range must be specified. Please note that using a single hash range equal to a route key's hash range is NOT equivalent to using the `split.key` parameter because multiple route keys can hash to the same range.
This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 1001-1500 specified in hexadecimal. Here the index will be split into three pieces with each targetCore receiving documents matching the hash ranges specified, i.e., core1 will get documents with hash range 0-500, core2 will receive documents with hash range 501-1000 and finally, core3 will receive documents with hash range 1001-1500. At least one hash range must be specified. Please note that using a single hash range equal to a route key's hash range is NOT equivalent to using the `split.key` parameter because multiple route keys can hash to the same range.
The `targetCore` must already exist and must have a compatible schema with the `core` index. A commit is automatically called on the `core` index before it is split.

View File

@ -79,7 +79,7 @@ If `docValues="true"` for a field, then DocValues will automatically be used any
=== Retrieving DocValues During Search
Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will be also returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g. "`fl=*`") for search queries depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>> & <<defining-fields.adoc#defining-fields,Defining Fields>> for more details.
Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will also be returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g., "`fl=*`") for search queries depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>> & <<defining-fields.adoc#defining-fields,Defining Fields>> for more details.
When `useDocValuesAsStored="false"`, non-stored DocValues fields can still be explicitly requested by name in the <<common-query-parameters.adoc#fl-field-list-parameter,fl param>>, but will not match glob patterns (`"*"`). Note that returning DocValues along with "regular" stored fields at query time has performance implications that stored fields may not have, because DocValues are column-oriented and may therefore incur additional cost to retrieve for each returned document. Also note that while returning non-stored fields from DocValues, the values of a multi-valued field are returned in sorted order (and not insertion order). If you require the multi-valued fields to be returned in the original insertion order, then make your multi-valued field stored (such a change requires re-indexing).
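As an illustration, a non-stored docValues field that would be returned by `fl=*` might be defined like this (the field name is hypothetical):

[source,xml]
----
<field name="tags" type="string" indexed="true" stored="false"
       docValues="true" multiValued="true" useDocValuesAsStored="true"/>
----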

View File

@ -154,7 +154,7 @@ server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clusterprop -n
server\scripts\cloud-scripts\zkcli.bat -zkhost localhost:2181 -cmd clusterprop -name urlScheme -val https
----
If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.adoc#zookeeper-chroot,chroot for Solr>>, make sure you use the correct `zkhost` string with `zkcli`, e.g. `-zkhost localhost:2181/solr`.
If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.adoc#zookeeper-chroot,chroot for Solr>>, make sure you use the correct `zkhost` string with `zkcli`, e.g., `-zkhost localhost:2181/solr`.
=== Run SolrCloud with SSL
@ -317,7 +317,7 @@ java -Djavax.net.ssl.keyStorePassword=secret -Djavax.net.ssl.keyStore=../../serv
=== Query Using curl
Use curl to query the SolrCloud collection created above, from a directory containing the PEM formatted certificate and key created above (e.g. `example/etc/`) - if you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true)`, then you can remove the `-E solr-ssl.pem:secret` option:
Use curl to query the SolrCloud collection created above, from a directory containing the PEM formatted certificate and key created above (e.g., `example/etc/`). If you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true`), then you can remove the `-E solr-ssl.pem:secret` option:
[source,bash]
----

View File

@ -345,7 +345,7 @@ This filter stems plural English words to their singular form.
== English Possessive Filter
This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g. the *s'* in "divers' snorkels", are not removed by this filter.
This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g., the *s'* in "divers' snorkels", are not removed by this filter.
*Factory class:* `solr.EnglishPossessiveFilterFactory`
@ -1608,7 +1608,7 @@ This filter splits tokens at word delimiters.
.Word Delimiter Filter has been Deprecated
[WARNING]
====
Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that e.g. phrase queries can work correctly.
Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that, e.g., phrase queries can work correctly.
====
*Factory class:* `solr.WordDelimiterFilterFactory`

View File

@ -145,7 +145,7 @@ bin/solr restart -c -p 7574 -z localhost:9983 -s example/cloud/node2/solr
Notice that you need to specify the ZooKeeper address (`-z localhost:9983`) when starting node2 so that it can join the cluster with node1.
=== Adding a node to a cluster
=== Adding a Node to a Cluster
Adding a node to an existing cluster is a bit advanced and involves a little more understanding of Solr. Once you start up a SolrCloud cluster using the startup scripts, you can add a new node to it by:

View File

@ -82,7 +82,7 @@ The default is `<em>`.
The default is `</em>`.
`hl.encoder`::
If blank, the default, then the stored text will be returned without any escaping/encoding performed by the highlighter. If set to `html` then special HMTL/XML characters will be encoded (e.g. `&` becomes `\&amp;`). The pre/post snippet characters are never encoded.
If blank, the default, then the stored text will be returned without any escaping/encoding performed by the highlighter. If set to `html`, then special HTML/XML characters will be encoded (e.g., `&` becomes `\&amp;`). The pre/post snippet characters are never encoded.
`hl.maxAnalyzedChars`::
The character limit to look for highlights, after which no highlighting will be done. This is mostly only a performance concern for an _analysis_ based offset source since it's the slowest. See <<Schema Options and Performance Considerations>>.
@ -146,7 +146,7 @@ There are four highlighters available that can be chosen at runtime with the `hl
+
The Unified Highlighter is the newest highlighter (as of Solr 6.4), which stands out as the most flexible and performant of the options. We recommend that you try this highlighter even though it isn't the default (yet).
+
This highlighter supports the most common highlighting parameters and can handle just about any query accurately, even SpanQueries (e.g. as seen from the `surround` parser). A strong benefit to this highlighter is that you can opt to configure Solr to put more information in the underlying index to speed up highlighting of large documents; multiple configurations are supported, even on a per-field basis. There is little or no such flexibility for the other highlighters. More on this below.
This highlighter supports the most common highlighting parameters and can handle just about any query accurately, even SpanQueries (e.g., as seen from the `surround` parser). A strong benefit to this highlighter is that you can opt to configure Solr to put more information in the underlying index to speed up highlighting of large documents; multiple configurations are supported, even on a per-field basis. There is little or no such flexibility for the other highlighters. More on this below.
<<The Original Highlighter,Original Highlighter>>:: (`hl.method=original`, the default)
+
@ -304,7 +304,7 @@ If `true`, multi-valued fields will return all values in the order they were sav
`hl.payloads`::
When `hl.usePhraseHighlighter` is `true` and the indexed field has payloads but not term vectors (generally quite rare), the index's payloads will be read into the highlighter's memory index along with the postings.
+
If this may happen and you know you don't need them for highlighting (i.e. your queries don't filter by payload) then you can save a little memory by setting this to false.
If this may happen and you know you don't need them for highlighting (i.e., your queries don't filter by payload) then you can save a little memory by setting this to false.
The Original Highlighter has a plugin architecture that enables new functionality to be registered in `solrconfig.xml`. The "```techproducts```" configset shows most of these settings explicitly. You can use it as a guide to provide your own components to include a `SolrFormatter`, `SolrEncoder`, and `SolrFragmenter`.

View File

@ -101,7 +101,7 @@ Integer specifying how many backups to keep. This can be used to delete all but
The configuration files to replicate, separated by a comma.
`commitReserveDuration`::
If your commits are very frequent and your network is slow, you can tweak this parameter to increase the amount of time expected to be required to transfer data. The default is `00:00:10` i.e. 10 seconds.
If your commits are very frequent and your network is slow, you can tweak this parameter to increase the amount of time expected to be required to transfer data. The default is `00:00:10`, i.e., 10 seconds.
The example below shows a possible 'master' configuration for the `ReplicationHandler`, including a fixed number of backups and an invariant setting for the `maxWriteMBPerSec` request parameter to prevent slaves from saturating its network interface.

View File

@ -165,7 +165,7 @@ Expert options:
`caseFirst`:: Valid values are `lower` or `upper`. Useful to control which is sorted first when case is not ignored.
`numeric`:: (true/false) If true, digits are sorted according to numeric value, e.g. foobar-9 sorts before foobar-10. The default is false.
`numeric`:: (true/false) If true, digits are sorted according to numeric value, e.g., foobar-9 sorts before foobar-10. The default is false.
`variableTop`:: Single character or contraction. Controls what is variable for `alternate`.
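These expert options are applied as attributes on the collation field type. A sketch, with an illustrative type name, might look like:

[source,xml]
----
<fieldType name="collatedNumeric" class="solr.ICUCollationField"
           locale="en" strength="primary" numeric="true"/>
----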

View File

@ -46,7 +46,7 @@ A feature is a value, a number, that represents some quantity or quality of the
==== Normalizer
Some ranking models expect features on a particular scale. A normalizer can be used to translate arbitrary feature values into normalized values e.g. on a 0..1 or 0..100 scale.
Some ranking models expect features on a particular scale. A normalizer can be used to translate arbitrary feature values into normalized values, e.g., on a 0..1 or 0..100 scale.
=== Training Models
@ -82,7 +82,7 @@ The ltr contrib module includes a <<transforming-result-documents.adoc#transform
==== Feature Selection and Model Training
Feature selection and model training take place offline and outside Solr. The ltr contrib module supports two generalized forms of models as well as custom models. Each model class's javadocs contain an example to illustrate configuration of that class. In the form of JSON files your trained model or models (e.g. different models for different customer geographies) can then be directly uploaded into Solr using provided REST APIs.
Feature selection and model training take place offline and outside Solr. The ltr contrib module supports two generalized forms of models as well as custom models. Each model class's javadocs contain an example to illustrate configuration of that class. In the form of JSON files, your trained model or models (e.g., different models for different customer geographies) can then be directly uploaded into Solr using provided REST APIs.
[cols=",,",options="header",]
|===
@ -609,7 +609,7 @@ The feature store and the model store are both <<managed-resources.adoc#managed-
* Conventions used:
** `<store>.json` file contains features for the `<store>` feature store
** `<model>.json` file contains model name `<model>`
** a 'generation' id (e.g. `YYYYMM` year-month) is part of the feature store and model names
** a 'generation' id (e.g., `YYYYMM` year-month) is part of the feature store and model names
** The model's features and weights are sorted alphabetically by name; this makes it easy to see what the commonalities and differences between the two models are.
** The store's features are sorted alphabetically by name; this makes it easy to see what the commonalities and differences between the two feature stores are.

View File

@ -299,10 +299,10 @@ and re-reported elsewhere as necessary.
In the case of the shard reporter the target node is the shard leader; in the case of the cluster reporter the
target node is the Overseer leader.
=== Shard reporter
=== Shard Reporter
This reporter uses the predefined `shard` group, and the implementing class must be (a subclass of)
`solr.SolrShardReporter`. It publishes selected metrics from replicas to the node where the shard leader is
located. Reports use a target registry name that is the replica's registry name with a `.leader` suffix, eg. for a
located. Reports use a target registry name that is the replica's registry name with a `.leader` suffix, e.g., for a
SolrCore name `collection1_shard1_replica_n3` the target registry name is
`solr.core.collection1.shard1.replica_n3.leader`.
@ -334,7 +334,7 @@ Example configuration:
</reporter>
----
=== Cluster reporter
=== Cluster Reporter
This reporter uses the predefined `cluster` group, and the implementing class must be (a subclass of)
`solr.SolrClusterReporter`. It publishes selected metrics from any local registry to the Overseer leader node.

View File

@ -61,7 +61,7 @@ One point of confusion is how much data is contained in a tlog. A tlog does not
WARNING: Implicit in the above is that transaction logs will grow forever if hard commits are disabled. Therefore it is important that hard commits be enabled when indexing.
=== Configuring commits
=== Configuring Commits
As mentioned above, it is usually preferable to configure your commits (both hard and soft) in `solrconfig.xml` and avoid sending commits from an external source. Check your `solrconfig.xml` file since the defaults are likely not tuned to your needs. Here is an example NRT configuration for the two flavors of commit, a hard commit every 60 seconds and a soft commit every 30 seconds. Note that these are _not_ the values in some of the examples!

View File

@ -40,7 +40,7 @@ When modifying the schema with the API, a core reload will automatically occur i
.Re-index after schema modifications!
[IMPORTANT]
====
If you modify your schema, you will likely need to re-index all documents. If you do not, you may lose access to documents, or not be able to interpret them properly, e.g. after replacing a field type.
If you modify your schema, you will likely need to re-index all documents. If you do not, you may lose access to documents, or not be able to interpret them properly, e.g., after replacing a field type.
Modifying your schema will never modify any documents that are already indexed. You must re-index documents in order to apply schema changes to them. Queries and updates made after the change may encounter errors that were not present before the change. Completely deleting the index and rebuilding it is usually the only option to fix such errors.
====
@ -595,7 +595,7 @@ If neither the `fl` query parameter nor the `fieldname` path parameter is specif
If `false`, the default, matching dynamic fields will not be returned.
`showDefaults`::
If `true`, all default field properties from each field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
If `true`, all default field properties from each field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
==== List Fields Response
@ -670,7 +670,7 @@ The query parameters can be added to the API request after a '?'.
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`showDefaults`::
If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
==== List Dynamic Field Response
@ -753,7 +753,7 @@ The query parameters can be added to the API request after a '?'.
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`showDefaults`::
If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
==== List Field Type Response

View File

@ -463,7 +463,7 @@ Other collections can share the same configuration by specifying the name of the
==== Data-driven Schema and Shared Configurations
The `_default` schema can mutate as data is indexed, since it has schemaless functionality (i.e. data-driven changes to the schema). Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. You can turn off schemaless functionality (i.e. data-driven changes to the schema) for a collection by the following (assuming the collection name is `mycollection`):
The `_default` schema can mutate as data is indexed, since it has schemaless functionality (i.e., data-driven changes to the schema). Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. You can turn off schemaless functionality (i.e., data-driven changes to the schema) for a collection by the following (assuming the collection name is `mycollection`):
`curl http://host:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'`

View File

@ -24,7 +24,7 @@ The Autoscaling API is used to manage autoscaling policies, preferences, trigger
== Read API
The autoscaling Read API is available at `/admin/autoscaling` or `/solr/cluster/autoscaling`. It returns information about the configured cluster preferences, cluster policy, collection-specific policies triggers and listeners.
The autoscaling Read API is available at `/solr/admin/autoscaling` or `/api/cluster/autoscaling` (v2 API style). It returns information about the configured cluster preferences, cluster policy, collection-specific policies, triggers and listeners.
This API does not take any parameters.
@ -145,26 +145,27 @@ However, since the first node in the first example had more than 1 replica for a
In the above example the node with port 8983 has two replicas for `shard1` in violation of our policy.
== History API
History of autoscaling events is available at `/admin/autoscaling/history`. It returns information
The history of autoscaling events is available at `/admin/autoscaling/history`. It returns information
about past autoscaling events and details about their processing. This history is kept in
the `.system` collection, and populated by a trigger listener `SystemLogListener` - by default this
the `.system` collection, and is populated by a trigger listener `SystemLogListener`. By default this
listener is added to all new triggers.
History events are regular Solr documents so they can be also accessed directly by
searching on the `.system` collection. History handler acts as a regular search handler, so all
searching on the `.system` collection. The history handler acts as a regular search handler, so all
query parameters supported by the `/select` handler for that collection are supported here too.
However, the history handler makes this
process easier by offering a simpler syntax and knowledge of field names
used by `SystemLogListener` for serialization of event data.
History documents contain also the action context, if it was available, which gives
further insight into e.g. exact operations that were computed and/or executed.
History documents contain the action context, if it was available, which gives
further insight into, e.g., the exact operations that were computed and/or executed.
Specifically, the following query parameters can be used (they are turned into
filter queries, so an implicit AND is applied):
* `trigger` - trigger name
* `eventType` - event type / trigger type (eg. `nodeAdded`)
* `eventType` - event type / trigger type (e.g., `nodeAdded`)
* `collection` - collection name involved in event processing
* `stage` - event processing stage
* `action` - trigger action
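For example, a request for the history of events from a hypothetical trigger, filtered by event type, might look like:

[source,text]
----
http://localhost:8983/solr/admin/autoscaling/history?trigger=node_lost_trigger&eventType=nodeLost
----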
@ -236,18 +237,18 @@ filter queries, so an implicit AND is applied):
.Broken v2 API support
[WARNING]
====
Due to a bug in Solr 7.1.0, the History API is available only at the path /admin/autoscaling/history. Using the /api/cluster/autoscaling/history endpoint returns an error.
Due to a bug in Solr 7.1.0, the History API is available only at the path `/admin/autoscaling/history`. Using the `/api/cluster/autoscaling/history` endpoint returns an error.
====
== Write API
The Write API is available at the same `/admin/autoscaling` and `/api/cluster/autoscaling` endpoints as the read API but can only be used with the *POST* HTTP verb.
The Write API is available at the same `/admin/autoscaling` and `/api/cluster/autoscaling` endpoints as the Read API but can only be used with the *POST* HTTP verb.
The payload of the POST request is a JSON message with commands to set and remove components. Multiple commands can be specified together in the payload. The commands are executed in the order specified and the changes are atomic, i.e., either all succeed or none.
=== Create and Modify Cluster Preferences
Cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in order.
Cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in the order they are set.
They are defined using the `set-cluster-preferences` command.
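For instance, a minimal payload setting a single sort preference might look like this:

[source,json]
----
{
  "set-cluster-preferences": [
    {"minimize": "cores"}
  ]
}
----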
@ -336,7 +337,7 @@ Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#policy-specificatio
}
----
Output:
*Output*:
[source,json]
----
{
@ -422,7 +423,7 @@ If you attempt to remove a policy that is being used by a collection, this comma
=== Create/Update Trigger
The set-trigger command can be used to create a new trigger or overwrite an existing one.
The `set-trigger` command can be used to create a new trigger or overwrite an existing one.
You can see the section <<solrcloud-autoscaling-triggers.adoc#trigger-configuration,Trigger Configuration>> for a full list of configuration options.
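As a sketch, a `set-trigger` payload for a hypothetical trigger watching for lost nodes might look like this (the trigger name and `waitFor` value are illustrative):

[source,json]
----
{
  "set-trigger": {
    "name": "node_lost_trigger",
    "event": "nodeLost",
    "waitFor": "120s",
    "enabled": true
  }
}
----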
@ -464,7 +465,7 @@ You can see the section <<solrcloud-autoscaling-triggers.adoc#trigger-configurat
=== Remove Trigger
The remove-trigger command can be used to remove a trigger. It accepts a single parameter: the name of the trigger.
The `remove-trigger` command can be used to remove a trigger. It accepts a single parameter: the name of the trigger.
.Removing the nodeLost Trigger
[source,json]
@ -478,7 +479,7 @@ The remove-trigger command can be used to remove a trigger. It accepts a single
=== Create/Update Trigger Listener
The set-listener command can be used to create or modify a listener for a trigger.
The `set-listener` command can be used to create or modify a listener for a trigger.
You can see the section <<solrcloud-autoscaling-listeners.adoc#listener-configuration,Trigger Listener Configuration>> for a full list of configuration options.
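As an illustration, a `set-listener` payload attaching a listener to a hypothetical trigger might look like this (the listener and trigger names are illustrative):

[source,json]
----
{
  "set-listener": {
    "name": "foo",
    "trigger": "node_lost_trigger",
    "stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"],
    "class": "solr.SystemLogListener"
  }
}
----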
@ -497,7 +498,7 @@ You can see the section <<solrcloud-autoscaling-listeners.adoc#listener-configur
=== Remove Trigger Listener
The remove-listener command can be used to remove an existing listener. It accepts a single parameter: the name of the listener.
The `remove-listener` command can be used to remove an existing listener. It accepts a single parameter: the name of the listener.
.Removing the foo listener
[source,json]

View File

@ -21,28 +21,27 @@
Solr provides a way to automatically add replicas for a collection when the number of active replicas drops below
the replication factor specified when the collection was created.
== `autoAddReplicas` parameter
== The autoAddReplicas Parameter
The boolean `autoAddReplicas` parameter can be passed to the Create Collection API to enable this feature for a given collection.
The boolean `autoAddReplicas` parameter can be passed to the CREATE command of the Collection API to enable this feature for a given collection.
.Creating a collection with autoAddReplicas feature
`http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=5&autoAddReplicas=true`
.Create a collection with autoAddReplicas enabled
[source,text]
http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=5&autoAddReplicas=true
The modify collection API can be used to enable/disable this feature for any collection.
The MODIFYCOLLECTION command can be used to enable/disable this feature for any collection.
.Modifying collection to disable autoAddReplicas feature
`http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&name=my_collection&autoAddReplicas=false`
.Modify collection to disable autoAddReplicas
[source,text]
http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&name=my_collection&autoAddReplicas=false
== Implementation using `.autoAddReplicas` trigger
== Implementation Using .autoAddReplicas Trigger
A Trigger named `.autoAddReplicas` is automatically created whenever any collection has the autoAddReplicas feature enabled.
Only one trigger is sufficient to server all collections having this feature enabled. The `.autoAddReplicas` trigger watches
for nodes that are lost from the cluster and uses the default TriggerActions to create new replicas to replace the ones
which were hosted by the lost node. If the old node comes back online, it unloads the moved replicas and is free to host other
replicas as and when required.
Since the trigger provides the autoAddReplicas feature for all collections, the suspend-trigger and resume-trigger APIs
can be used to disable and enable this feature for all collections in one API call.
Only one trigger is sufficient to serve all collections having this feature enabled. The `.autoAddReplicas` trigger watches for nodes that are lost from the cluster and uses the default `TriggerActions` to create new replicas to replace the ones which were hosted by the lost node. If the old node comes back online, it unloads the moved replicas and the node is free to host other replicas as and when required.
Since the trigger provides the autoAddReplicas feature for all collections, the `suspend-trigger` and `resume-trigger` Autoscaling API commands can be used to disable and enable this feature for all collections in one API call. See the section <<solrcloud-autoscaling-api.adoc#solrcloud-autoscaling-api,Autoscaling API>> for details on these commands.
.Suspending autoAddReplicas for all collections
[source,json]
@@ -64,13 +63,13 @@ can be used to disable and enable this feature for all collections in one API ca
}
----
== Using cluster property to enable/disable autoAddReplicas
== Using Cluster Property to Enable autoAddReplicas
A cluster property, also named `autoAddReplicas`, can be set to `false` to disable this feature for all collections.
If this cluster property is missing or set to `true` the autoAddReplicas is enabled for all collections.
If this cluster property is missing or set to `true`, autoAddReplicas is enabled for all collections.
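For example, the property can be set cluster-wide with the CLUSTERPROP command of the Collections API (shown here as a sketch):

.Disable autoAddReplicas via cluster property
[source,text]
/admin/collections?action=CLUSTERPROP&name=autoAddReplicas&val=false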
.Deprecation Warning
[WARNING]
====
Using cluster property to enable/disable autoAddReplicas is deprecated and only supported for back compatibility. Please use the suspend-trigger and resume-trigger APIs instead.
====
Using a cluster property to enable or disable autoAddReplicas is deprecated and only supported for backward compatibility. Please use the `suspend-trigger` and `resume-trigger` API commands instead.
====


@@ -18,43 +18,44 @@
// specific language governing permissions and limitations
// under the License.
== Node added / lost markers
Since triggers execute on the node that runs Overseer, should this node go down the `nodeLost`
The autoscaling framework uses a few strategies to ensure it's able to still trigger actions in the event of unexpected changes to the system.
== Node Added or Lost Markers
Since triggers execute on the node that runs the Overseer, should the Overseer node go down, the `nodeLost`
event would be lost because there would be no mechanism to generate it. Similarly, if a node has
been added between the Overseer leader change was completed the `nodeAdded` event would not be
been added before the Overseer leader change was completed, the `nodeAdded` event would not be
generated.
For this reason, Solr implements additional mechanisms to ensure that these events are generated
reliably.
When a node joins a cluster its presence is marked as an ephemeral ZK path in the `/live_nodes/<nodeName>`
ZooKeeper directory, but now also an ephemeral path is created under `/autoscaling/nodeAdded/<nodeName>`.
With standard SolrCloud behavior, when a node joins a cluster its presence is marked as an ephemeral ZooKeeper path in the `/live_nodes/<nodeName>` ZooKeeper directory. Now an ephemeral path is also created under `/autoscaling/nodeAdded/<nodeName>`.
When a new instance of Overseer leader is started it will run the `nodeAdded` trigger (if it's configured)
and discover the presence of this ZK path, at which point it will remove it and generate a `nodeAdded` event.
and discover the presence of this ZooKeeper path, at which point it will remove it and generate a `nodeAdded` event.
When a node leaves the cluster up to three remaining nodes will try to create a persistent ZK path
When a node leaves the cluster, up to three remaining nodes will try to create a persistent ZooKeeper path
`/autoscaling/nodeLost/<nodeName>` and eventually one of them succeeds. When a new instance of Overseer leader
is started it will run the `nodeLost` trigger (if it's configured) and discover the presence of this ZK
is started it will run the `nodeLost` trigger (if it's configured) and discover the presence of this ZooKeeper
path, at which point it will remove it and generate a `nodeLost` event.
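For illustration, with a hypothetical node named `192.168.1.1:8983_solr`, the marker paths created in ZooKeeper would look like this:

[source,text]
----
/autoscaling/nodeAdded/192.168.1.1:8983_solr
/autoscaling/nodeLost/192.168.1.1:8983_solr
----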
== Trigger state checkpointing
Triggers generate events based on their internal state. If Overseer leader goes down while the trigger is
== Trigger State Checkpointing
Triggers generate events based on their internal state. If the Overseer leader goes down while the trigger is
about to generate a new event, it's likely that the event would be lost because a new trigger instance
running on the new Overseer leader would start from a clean slate.
For this reason after each time a trigger is executed its internal state is persisted to ZooKeeper, and
For this reason, after each time a trigger is executed its internal state is persisted to ZooKeeper, and
on Overseer start its internal state is restored.
== Trigger event queues
== Trigger Event Queues
The autoscaling framework limits the rate at which events are processed using several different mechanisms.
One is the locking mechanism that prevents concurrent
processing of events, and another is a single-threaded executor that runs trigger actions.
This means that the processing of an event may take significant time, and during this time it's possible that
This means that the processing of an event may take significant time, and during this time it's possible that the
Overseer may go down. In order to avoid losing events that were already generated but not yet fully
processed events are queued before processing is started.
processed, events are queued before processing is started.
Separate ZooKeeper queues are created for each trigger, and events produced by triggers are put on these
per-trigger queues. When a new Overseer leader is started it will first check
these queues and process events accumulated there, and only then it will continue to run triggers
normally. Queued events that fail processing during this "replay" stage are discarded.
normally. Queued events that fail processing during this "replay" stage are discarded.


@@ -18,19 +18,18 @@
// specific language governing permissions and limitations
// under the License.
Trigger listener API allows users to provide additional behavior related to trigger events as they are being processed.
Trigger Listeners allow users to configure additional behavior related to trigger events as they are being processed.
For example, users may want to record autoscaling events to an external system, or notify administrator when a
particular type of event occurs, or when its processing reaches certain stage (eg. failed).
For example, users may want to record autoscaling events to an external system, or notify an administrator when a
particular type of event occurs or when its processing reaches a certain stage (e.g., failed).
Listener configuration always refers to a specific trigger configuration - listener is notified of
Listener configuration always refers to a specific trigger configuration because a listener is notified of
events generated by that specific trigger. Several (or none) named listeners can be registered for a trigger,
and they will be notified in the order in which they were defined.
Listener configuration can specify what processing stages are of interest - when an event enters this processing stage
the listener will be notified. Currently the following stages are recognized:
Listener configuration can specify what processing stages are of interest, and when an event enters this processing stage the listener will be notified. Currently the following stages are recognized:
* STARTED - when event has been generated by a trigger and its processing is starting.
* STARTED - when an event has been generated by a trigger and its processing is starting.
* ABORTED - when an event was being processed while the source trigger closed.
* BEFORE_ACTION - when a `TriggerAction` is about to be invoked. Action name and the current `ActionContext` are passed to the listener.
* AFTER_ACTION - after a `TriggerAction` has been successfully invoked. Action name, `ActionContext` and the list of action
@@ -38,28 +37,27 @@ the listener will be notified. Currently the following stages are recognized:
* FAILED - when event processing failed (or when a `TriggerAction` failed)
* SUCCEEDED - when event processing completes successfully
Listener configuration can also specify what particular actions are of interest, both
before and/or after they are invoked.
Listener configuration can also specify what particular actions are of interest, both before and/or after they are invoked.
== Listener configuration
== Listener Configuration
Currently the following listener configuration properties are supported:
* `name` - (string, required) unique listener configuration name.
* `trigger` - (string, required) name of an existing trigger configuration.
* `class` - (string, required) listener implementation class name.
* `stage` - (list of strings, optional, ignored case) list of processing stages that
* `name` - (string, required) A unique listener configuration name.
* `trigger` - (string, required) The name of an existing trigger configuration.
* `class` - (string, required) A listener implementation class name.
* `stage` - (list of strings, optional, ignored case) A list of processing stages that
this listener should be notified of. Default is an empty list.
* `beforeAction` - (list of strings, optional) list of action names (as defined in trigger configuration) before
* `beforeAction` - (list of strings, optional) A list of action names (as defined in trigger configuration) before
which the listener will be notified. Default is an empty list.
* `afterAction` - (list of strings, optional) list of action names after which the listener will be notified.
* `afterAction` - (list of strings, optional) A list of action names after which the listener will be notified.
Default is an empty list.
* additional implementation-specific properties may be provided.
* Additional implementation-specific properties may be provided.
Note: when both `stage` and `beforeAction` / `afterAction` lists are non-empty, the listener will be notified both
when a specified stage is entered and before / after the specified actions.
=== Managing listener configurations
Listener configurations can be managed using autoscaling Write API, and using `set-listener` and `remove-listener`
=== Managing Listener Configurations
Listener configurations can be managed using the Autoscaling Write API, and using `set-listener` and `remove-listener`
commands.
For example:
@@ -85,17 +83,15 @@ For example:
}
----
== Listener Implementations
Trigger listeners must implement the `TriggerListener` interface. Solr provides some
implementations of trigger listeners, which cover common use cases. These implementations are described below, together with their configuration parameters.
== Listener implementations
Trigger listeners must implement `TriggerListener` interface. Solr provides some
implementations of trigger listeners, which cover common use cases. These implementations are described in sections
below, together with their configuration parameters.
=== `SystemLogListener`
=== SystemLogListener
This trigger listener sends trigger events and processing context as documents for indexing in
the SolrCloud `.system` collection.
When a trigger configuration is first created a corresponding trigger listener configuration that
When a trigger configuration is first created, a corresponding trigger listener configuration that
uses `SystemLogListener` is also automatically created, to make sure that all events and
actions related to the autoscaling framework are logged to the `.system` collection.
@@ -121,7 +117,6 @@ Documents created by this listener have several predefined fields:
* `event_str` - JSON representation of all event properties
* `context_str` - JSON representation of all `ActionContext` properties, if available
The following fields are created using the information from the trigger event:
* `event.id_s` - event id
@@ -145,23 +140,22 @@ trigger named `foo`):
}
----
=== `HttpTriggerListener`
This listener uses HTTP POST to send a representation of event and context to a specified URL.
URL, payload and headers may contain property substitution patterns, which are then replaced with values takes from the
current event or context properties.
=== HttpTriggerListener
This listener uses HTTP POST to send a representation of the event and context to a specified URL.
The URL, payload, and headers may contain property substitution patterns, which are then replaced with values taken from the current event or context properties.
Templates use the same syntax as property substitution in Solr configuration files, eg.
Templates use the same syntax as property substitution in Solr configuration files, e.g.,
`${foo.bar:baz}` means that the value of the `foo.bar` property should be taken, and `baz` should be used
if the value is absent.
Supported configuration properties:
* `url` - (string, required) a URL template
* `payload` - (string, optional) payload template. If absent a JSON map of all properties listed above will be used.
* `contentType` - (string, optional) payload content type. If absent then application/json will be used.
* `header.*` - (string, optional) header template(s). The name of the property without "header." prefix defines the literal header name.
* `timeout` - (int, optional) connection and socket timeout in milliseconds. Default is 60 seconds.
* `followRedirects` - (boolean, optional) setting to follow redirects. Default is false.
* `url` - (string, required) A URL template.
* `payload` - (string, optional) A payload template. If absent, a JSON map of all properties listed above will be used.
* `contentType` - (string, optional) A payload content type. If absent then `application/json` will be used.
* `header.*` - (string, optional) Header template(s). The name of the property without the "header." prefix defines the literal header name.
* `timeout` - (int, optional) Connection and socket timeout in milliseconds. Default is `60000` milliseconds (60 seconds).
* `followRedirects` - (boolean, optional) Allows following redirects. Default is `false`.
The following properties are available in context and can be referenced from templates:
@@ -173,7 +167,7 @@ The following properties are available in context and can be referenced from tem
* `error` - optional error string (from `Throwable.toString()`)
* `message` - optional message
Example configuration:
.Example HttpTriggerListener
[source,json]
----
{
@@ -185,12 +179,12 @@ Example configuration:
"header.X-Trigger": "${config.trigger}",
"payload": "actionName=${actionName}, source=${event.source}, type=${event.eventType}",
"contentType": "text/plain",
"stage": ["STARTED", "ABORTED", SUCCEEDED", "FAILED"],
"stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"],
"beforeAction": ["compute_plan", "execute_plan"],
"afterAction": ["compute_plan", "execute_plan"]
}
----
This configuration specifies that each time one of the listed stages is reached, or before and after each of the listed
actions is executed, the listener will send the templated payload to a URL that also depends on the config and the current event,
and with a custom header that indicates the trigger name.


@@ -24,56 +24,57 @@ Autoscaling in Solr aims to provide good defaults so a SolrCloud cluster remains
A simple example is automatically adding a replica for a SolrCloud collection when a node containing an existing replica goes down.
The goal of autoscaling feature is to make SolrCloud cluster management easier, automatic and intelligent. It aims to provide good defaults such that the cluster remains balanced and stable in the face of various events such as a node joining the cluster or leaving the cluster. This is achieved by satisfying a set of rules and sorting preferences that help Solr select the target of cluster management operations.
The goal of autoscaling in SolrCloud is to make cluster management easier, more automatic, and more intelligent. It aims to provide good defaults such that the cluster remains balanced and stable in the face of various events such as a node joining the cluster or leaving the cluster. This is achieved by satisfying a set of rules and sorting preferences that help Solr select the target of cluster management operations.
There are three distinct problems that this feature solves:
* When to run cluster management tasks? e.g. we might want to add a replica when an existing replica is no longer alive.
* Which cluster management task to run? e.g. do we add a new replica or should we move an existing one to a new node
* How to run the cluster management tasks such that the cluster remains balanced and stable?
* When to run cluster management tasks? For example, we might want to add a replica when an existing replica is no longer alive.
* Which cluster management task to run? For example, do we add a new replica or should we move an existing one to a new node?
* How do we run the cluster management tasks so the cluster remains balanced and stable?
Before we get into the details of how each of these problems is solved, let's take a quick look at the easiest way to set up autoscaling for your cluster.
== QuickStart: Automatically adding replicas
== Quick Start: Automatically Adding Replicas
Say that we want to create a collection which always requires us to have three replicas available for each shard all the time. We can set the replicationFactor=3 while creating the collection but what happens if a node containing one or more of the replicas either crashed or was shutdown for maintenance. In such a case, we'd like to create additional replicas to replace the ones that are no longer available to preserve the original number of replicas.
Say that we want to create a collection which always requires us to have three replicas available for each shard all the time. We can set the `replicationFactor=3` while creating the collection, but what happens if a node containing one or more of the replicas either crashed or was shut down for maintenance? In such a case, we'd like to create additional replicas to replace the ones that are no longer available to preserve the original number of replicas.
We have an easy way to enable this behavior without needing to understand the autoscaling feature in depth. We can create a collection with such behavior by adding an additional parameter `autoAddReplicas=true` to the create collection API. For example:
We have an easy way to enable this behavior without needing to understand the autoscaling features in depth. We can create a collection with such behavior by adding an additional parameter `autoAddReplicas=true` with the CREATE command of the Collection API. For example:
`/admin/collections?action=CREATE&name=_name_of_collection_&numShards=1&replicationFactor=3&autoAddReplicas=true`
[source,text]
/admin/collections?action=CREATE&name=_name_of_collection_&numShards=1&replicationFactor=3&autoAddReplicas=true
A collection created with `autoAddReplicas=true` will be monitored by Solr such that if a node containing a replica of this collection goes down, Solr will add new replicas on other nodes after waiting for up to thirty seconds for the node to come back.
You can see the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,SolrCloud AutoScaling Automatically Adding Replicas>> to learn more about how to enable or disable this feature as well as other details.
You can see the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling Automatically Adding Replicas>> to learn more about how to enable or disable this feature as well as other details.
The selection of the node that will host the new replica is made according to the default cluster preferences that we will learn more about in the next sections.
== Cluster Preferences
Cluster preferences, as the name suggests, apply to all cluster management operations regardless of which collection they affect.
A preference is a set of conditions that help Solr select nodes that either maximize or minimize given metrics. For example, a preference such as `{minimize:cores}` will help Solr select nodes such that the number of cores on each node is minimized. We write cluster preferences in a way that reduces the overall load on the system. You can add more than one preference to break ties.
The default cluster preferences consist of the above example (`{minimize : cores}`) which is to minimize the number of cores on all nodes.
The default cluster preferences consist of the above example (`{minimize:cores}`) which is to minimize the number of cores on all nodes.
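As a sketch, the default preference could be written out explicitly with the `set-cluster-preferences` command of the Autoscaling API:

.Example set-cluster-preferences command
[source,json]
----
{
 "set-cluster-preferences": [
  {"minimize": "cores"}
 ]
}
----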
You can learn more about preferences in the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Cluster Preferences>> section.
== Cluster Policy
A cluster policy is a set of conditions that a node, shard, or collection must satisfy before it can be chosen as the target of a cluster management operation. These conditions are applied across the cluster regardless of the collection being managed. For example, the condition `{"cores":"<10", "node":"#ANY"}` means that any node must have less than 10 Solr cores in total regardless of which collection they belong to.
A cluster policy is a set of conditions that a node, shard, or collection must satisfy before it can be chosen as the target of a cluster management operation. These conditions are applied across the cluster regardless of the collection being managed. For example, the condition `{"cores":"<10", "node":"#ANY"}` means that any node must have less than 10 Solr cores in total, regardless of which collection they belong to.
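A cluster policy containing that example condition could be created with the `set-cluster-policy` command of the Autoscaling API, e.g.:

.Example set-cluster-policy command
[source,json]
----
{
 "set-cluster-policy": [
  {"cores": "<10", "node": "#ANY"}
 ]
}
----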
There are many metrics on which the condition can be based, e.g., system load average, heap usage, free disk space, etc. The full list of supported metrics can be found in the section describing <<solrcloud-autoscaling-policy-preferences.adoc#policy-attributes,Policy Attributes>>.
There are many metrics on which the condition can be based, e.g., system load average, heap usage, free disk space, etc. The full list of supported metrics can be found in the section describing <<solrcloud-autoscaling-policy-preferences.adoc#policy-attributes,Autoscaling Policy Attributes>>.
When a node, shard, or collection does not satisfy the policy, we call it a *violation*. Solr ensures that cluster management operations minimize the number of violations. Cluster management operations are currently invoked manually. In the future, these cluster management operations may be invoked automatically in response to cluster events such as a node being added or lost.
== Collection-Specific Policies
A collection may need conditions in addition to those specified in the cluster policy. In such cases, we can create named policies that can be used for specific collections. Firstly, we can use the `set-policy` API to create a new policy and then specify the `policy=<policy_name>` parameter to the CREATE command of the Collection API.
A collection may need conditions in addition to those specified in the cluster policy. In such cases, we can create named policies that can be used for specific collections. Firstly, we can use the `set-policy` API to create a new policy and then specify the `policy=<policy_name>` parameter to the CREATE command of the Collection API:
`/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=2&policy=policy1`
[source,text]
/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=2&policy=policy1
The above create collection command will associate a policy named `policy1` with the collection named `coll1`. Only a single policy may be associated with a collection.
The above CREATE collection command will associate a policy named `policy1` with the collection named `coll1`. Only a single policy may be associated with a collection.
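The named policy itself could be created beforehand with a `set-policy` command such as the following (the rule shown is only illustrative):

.Example set-policy command
[source,json]
----
{
 "set-policy": {
  "policy1": [
   {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
  ]
 }
}
----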
Note that the collection-specific policy is applied *in addition to* the cluster policy, i.e., it is not an override but an augmentation. Therefore the collection will follow all conditions laid out in the cluster preferences, cluster policy, and the policy named `policy1`.
@@ -81,9 +82,11 @@ You can learn more about collection-specific policies in the section <<solrclou
== Triggers
Now that we have an idea about how cluster management operations use policy and preferences help Solr keep the cluster balanced and stable, we can talk about when to invoke such operations. Triggers are used to watch for events such as a node joining or leaving the cluster. When the event happens, the trigger executes a set of `actions` that compute and execute a *plan* i.e. a set of operations to change the cluster so that the policy and preferences are respected.
Now that we have an idea about how cluster management operations use policies and preferences to help Solr keep the cluster balanced and stable, we can talk about when to invoke such operations.
The `autoAddReplicas` parameter passed to the create collection API in the quickstart section automatically creates a trigger that watches for a node going away. When the trigger fires, it executes a set of actions that compute and execute a plan to move all replicas hosted by the lost node to new nodes in the cluster. The target nodes are chosen based on the policy and preferences.
Triggers are used to watch for events such as a node joining or leaving the cluster. When the event happens, the trigger executes a set of actions that compute and execute a *plan*, i.e., a set of operations to change the cluster so that the policy and preferences are respected.
The `autoAddReplicas` parameter passed with the CREATE Collection API command in the <<Quick Start: Automatically Adding Replicas,Quick Start>> section above automatically creates a trigger that watches for a node going away. When the trigger fires, it executes a set of actions that compute and execute a plan to move all replicas hosted by the lost node to new nodes in the cluster. The target nodes are chosen based on the policy and preferences.
You can learn more about Triggers in the section <<solrcloud-autoscaling-triggers.adoc#solrcloud-autoscaling-triggers,Autoscaling Triggers>>.
@@ -95,9 +98,9 @@ You can learn more about Trigger Actions in the section <<solrcloud-autoscaling-
== Listeners
An AutoScaling *Listener* can be attached to a trigger. Solr calls the listener each time the trigger fires as well as before and after the actions performed by the trigger. Listeners are useful as a call back mechanism to perform tasks such as logging or informing external systems about events. For example, a listener is automatically added by Solr to each Trigger to log details of the trigger fire and actions to the `.system` collection.
An Autoscaling *listener* can be attached to a trigger. Solr calls the listener each time the trigger fires as well as before and after the actions performed by the trigger. Listeners are useful as a callback mechanism to perform tasks such as logging or informing external systems about events. For example, a listener is automatically added by Solr to each trigger to log details of the trigger fire and actions to the `.system` collection.
You can learn more about Listeners in the section <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,AutoScaling Listeners>>.
You can learn more about Listeners in the section <<solrcloud-autoscaling-listeners.adoc#solrcloud-autoscaling-listeners,Autoscaling Listeners>>.
== Autoscaling APIs


@@ -26,17 +26,17 @@ The autoscaling policy and preferences are a set of rules and sorting preference
A preference is a hint to Solr on how to sort nodes based on their utilization. The default cluster preference is to sort by the total number of Solr cores (or replicas) hosted by a node. Therefore, by default, when selecting a node to add a replica, Solr can apply the preferences and choose the node with the least number of cores.
More than one preferences can be added to break ties. For example, we may choose to use free disk space to break ties if the number of cores on two nodes are the same so the node with the higher free disk space can be chosen as the target of the cluster operation.
More than one preference can be added to break ties. For example, we may choose to use free disk space to break ties if the number of cores on two nodes is the same. The node with the higher free disk space can be chosen as the target of the cluster operation.
Each preference is of the following form:
Each preference takes the following form:
[source,json]
{"<sort_order>":"<sort_param>", "precision":"<precision_val>"}
`sort_order`::
The value can be either `maximize` or `minimize`. `minimize` sorts the nodes with least value as the least loaded. For example, `{"minimize":"cores"}` sorts the nodes with the least number of cores as the least loaded node. A sort order such as `{"maximize":"freedisk"}` sorts the nodes with maximum free disk space as the least loaded node.
The value can be either `maximize` or `minimize`. Choose `minimize` to sort the nodes with least value as the least loaded. For example, `{"minimize":"cores"}` sorts the nodes with the least number of cores as the least loaded node. A sort order such as `{"maximize":"freedisk"}` sorts the nodes with maximum free disk space as the least loaded node.
+
The objective of the system is to make every node the least loaded. So, in case of a `MOVEREPLICA` operation, it usually targets the _most loaded_ node and takes load off of it. In a sort of more loaded to less loaded, `minimize` is akin to sort in descending order and `maximize` is akin to sorting in ascending order.
The objective of the system is to make every node the least loaded. So, in case of a `MOVEREPLICA` operation, it usually targets the _most loaded_ node and takes load off of it. In a sort of more loaded to less loaded, `minimize` is akin to sorting in descending order and `maximize` is akin to sorting in ascending order.
+
This is a required parameter.
@@ -53,7 +53,7 @@ Precision tells the system the minimum (absolute) difference between 2 values to
+
For example, a precision of 10 for `freedisk` means that two nodes whose free disk space is within 10GB of each other should be treated as equal for the purpose of sorting. This helps create ties without which specifying multiple preferences is not useful. This is an optional parameter whose value must be a positive integer. The maximum value of `precision` must be less than the maximum value of the `sort_value`, if any.
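As a minimal sketch of how sort order and precision combine (the values here are illustrative), a preference list that sorts primarily by core count and breaks ties by free disk space might look like:

[source,json]
----
[
  {"minimize": "cores", "precision": 1},
  {"maximize": "freedisk", "precision": 10}
]
----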
See the section <<solrcloud-autoscaling-api.adoc#create-and-modify-cluster-preferences,set-cluster-preferences API>> for details on how to manage cluster preferences.
See the section <<solrcloud-autoscaling-api.adoc#create-and-modify-cluster-preferences,Create and Modify Cluster Preferences>> for details on how to manage cluster preferences with the API.
=== Examples of Cluster Preferences

View File

@@ -23,14 +23,14 @@ health and good use of resources.
Currently two implementations are provided: `ComputePlanAction` and `ExecutePlanAction`.
== Compute plan action
== Compute Plan Action
The `ComputePlanAction` uses the policy and preferences to calculate the optimal set of Collection API
commands which can re-balance the cluster in response to trigger events.
Currently, it has no configurable parameters.
== Execute plan action
== Execute Plan Action
The `ExecutePlanAction` executes the Collection API commands emitted by the `ComputePlanAction` against
the cluster using SolrJ. It executes the commands serially, waiting for each of them to succeed before
@@ -38,9 +38,9 @@ continuing with the next one.
Currently, it has no configurable parameters.
If any one of the command fails, then the complete chain of actions are
executed again at the next run of the trigger. If the Overseer node fails while ExecutePlanAction is running
then the new overseer node will run the chain of actions for the same event again after waiting for any
If any one of the commands fails, then the complete chain of actions is
executed again at the next run of the trigger. If the Overseer node fails while `ExecutePlanAction` is running,
then the new Overseer node will run the chain of actions for the same event again after waiting for any
running Collection API operations belonging to the event to complete.
Please see <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,SolrCloud AutoScaling Fault Tolerance>> for more details on fault tolerance within the Autoscaling framework.
Please see <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,SolrCloud Autoscaling Fault Tolerance>> for more details on fault tolerance within the autoscaling framework.

View File

@@ -18,19 +18,20 @@
// specific language governing permissions and limitations
// under the License.
Triggers are used by autoscaling API to watch for cluster events such as node joining or leaving,
and in the future also for other cluster, node and replica events that are important from the
point of view of cluster performance.
Triggers are used in autoscaling to watch for cluster events such as nodes joining or leaving.
In the future, triggers will also be available for other cluster, node, and replica events that are
important from the point of view of cluster performance.
Trigger implementations verify the state of resources that they monitor. When they detect a
change that merits attention they generate events, which are then queued and processed by configured
`TriggerAction` implementations - this usually involves computing and executing a plan to manage the new cluster
resources (eg. move replicas). Solr provides predefined implementations of triggers for specific event types.
`TriggerAction` implementations. This usually involves computing and executing a plan to manage the new cluster
resources (e.g., move replicas). Solr provides predefined implementations of triggers for specific event types.
Triggers execute on the node that runs `Overseer`. They are scheduled to run periodically,
currently at fixed interval of 1s between each execution (not every execution produces events).
currently at a fixed interval of 1 second between each execution (not every execution produces events).
== Event types
== Event Types
Currently the following event types (and corresponding trigger implementations) are defined:
* `nodeAdded` - generated when a new node joins the cluster
@@ -41,49 +41,45 @@ maximum rate of events is controlled by the `waitFor` configuration parameter (s
The following properties are common to all event types:
* `id` - (string) unique time-based event id.
* `eventType` - (string) event type.
* `source` - (string) name of the trigger that produced this event.
* `eventTime` - (long) Unix time when the condition that caused this event occurred. For example, for
* `id` - (string) A unique time-based event id.
* `eventType` - (string) The type of event.
* `source` - (string) The name of the trigger that produced this event.
* `eventTime` - (long) Unix time when the condition that caused this event occurred. For example, for a
`nodeAdded` event this will be the time when the node was added and not when the event was actually
generated, which may significantly differ due to the rate limits set by `waitFor`.
* `properties` - (map, optional) additional properties. Currently contains `nodeName` property that
* `properties` - (map, optional) Any additional properties. Currently includes the `nodeName` property that
indicates the node that was lost or added.
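Based on the properties listed above, an event payload might look roughly like the following sketch (all values are hypothetical, and the exact serialization may differ):

[source,json]
----
{
  "id": "4fac9fdc35T4ecdd",
  "source": ".auto_add_replicas",
  "eventType": "nodeLost",
  "eventTime": 1508951425000,
  "properties": {
    "nodeName": "192.168.1.10:8983_solr"
  }
}
----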
== `.autoAddReplicas` trigger
When a collection has a flag `autoAddReplicas` set to true then a trigger configuration named `.auto_add_replicas`
is automatically created to watch for nodes going away. This trigger produces `nodeLost` events,
== Auto Add Replicas Trigger
When a collection has the parameter `autoAddReplicas` set to `true`, a trigger configuration named `.auto_add_replicas` is automatically created to watch for nodes going away. This trigger produces `nodeLost` events,
which are then processed by configured actions (usually resulting in computing and executing a plan
to add replicas on the live nodes to maintain the expected replication factor).
You can see the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,SolrCloud AutoScaling Automatically Adding Replicas>> to learn more about how `.autoAddReplicas` work.
You can see the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling Automatically Adding Replicas>> to learn more about how the `.autoAddReplicas` trigger works.
== Trigger configuration
Trigger configurations are managed using autoscaling Write API with commands `set-trigger`, `remove-trigger`,
`suspend-trigger`, `resume-trigger`.
== Trigger Configuration
Trigger configurations are managed using the Autoscaling Write API and the commands `set-trigger`, `remove-trigger`,
`suspend-trigger`, and `resume-trigger`.
Trigger configuration consists of the following properties:
* `name` - (string, required) unique trigger configuration name.
* `event` - (string, required) one of predefined event types (nodeAdded, nodeLost).
* `actions` - (list of action configs, optional) ordered list of actions to execute when event is fired
* `waitFor` - (string, optional) time to wait between generating new events, as an integer number immediately followed
by unit symbol, one of "s" (seconds), "m" (minutes), or "h" (hours). Default is "0s".
* `enabled` - (boolean, optional) when true the trigger is enabled. Default is true.
* additional implementation-specific properties may be provided
* `name` - (string, required) A unique trigger configuration name.
* `event` - (string, required) One of the predefined event types (`nodeAdded` or `nodeLost`).
* `actions` - (list of action configs, optional) An ordered list of actions to execute when the event is fired.
* `waitFor` - (string, optional) The time to wait between generating new events, as an integer number immediately followed by a unit symbol, one of `s` (seconds), `m` (minutes), or `h` (hours). Default is `0s`.
* `enabled` - (boolean, optional) When `true` the trigger is enabled. Default is `true`.
* Additional implementation-specific properties may be provided.
Action configuration consists of the following properties:
* `name` - (string, required) unique name of the action configuration.
* `class` - (string, required) action implementation class
* additional implementation-specific properties may be provided
* `name` - (string, required) A unique name of the action configuration.
* `class` - (string, required) The action implementation class.
* Additional implementation-specific properties may be provided.
If the Action configuration is omitted, then by default, the ComputePlanAction and the ExecutePlanAction are automatically
added to the trigger configuration.
If the Action configuration is omitted, then by default, the `ComputePlanAction` and the `ExecutePlanAction` are automatically added to the trigger configuration.
Example: adding / updating a trigger for `nodeAdded` events. This trigger configuration will
compute and execute a plan to allocate the resources available on the new node. A custom action
is also used to possibly modify the plan.
.Example: adding or updating a trigger for `nodeAdded` events
[source,json]
----
{
@@ -110,3 +107,4 @@ is also used to possibly modify the plan.
}
----
This trigger configuration will compute and execute a plan to allocate the resources available on the new node. A custom action is also used to possibly modify the plan.

View File

@@ -1,4 +1,4 @@
= SolrCloud AutoScaling
= SolrCloud Autoscaling
:page-shortname: solrcloud-autoscaling
:page-permalink: solrcloud-autoscaling.html
:page-children: solrcloud-autoscaling-overview, solrcloud-autoscaling-api, solrcloud-autoscaling-policy-preferences, solrcloud-autoscaling-triggers, solrcloud-autoscaling-trigger-actions, solrcloud-autoscaling-listeners, solrcloud-autoscaling-auto-add-replicas, solrcloud-autoscaling-fault-tolerance
@@ -27,10 +27,10 @@ Autoscaling includes an API to manage cluster-wide and collection-specific polic
The following sections describe the autoscaling features of SolrCloud:
* <<solrcloud-autoscaling-overview.adoc#solrcloud-autoscaling-overview,Overview of Autoscaling in SolrCloud>>
* <<solrcloud-autoscaling-api.adoc#solrcloud-autoscaling-api,SolrCloud Autoscaling API>>
* <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,SolrCloud Autoscaling Policy and Preferences>>
* <<solrcloud-autoscaling-triggers.adoc#solrcloud-autoscaling-triggers,SolrCloud AutoScaling Triggers>>
* <<solrcloud-autoscaling-trigger-actions.adoc#solrcloud-autoscaling-trigger-actions,SolrCloud AutoScaling Trigger Actions>>
* <<solrcloud-autoscaling-listeners.adoc#solrcloud-autoscaling-listeners,SolrCloud AutoScaling Listeners>>
* <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,SolrCloud AutoScaling - Automatically Adding Replicas>>
* <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,SolrCloud AutoScaling Fault Tolerance>>
* <<solrcloud-autoscaling-api.adoc#solrcloud-autoscaling-api,Autoscaling API>>
* <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>>
* <<solrcloud-autoscaling-triggers.adoc#solrcloud-autoscaling-triggers,Autoscaling Triggers>>
* <<solrcloud-autoscaling-trigger-actions.adoc#solrcloud-autoscaling-trigger-actions,Autoscaling Trigger Actions>>
* <<solrcloud-autoscaling-listeners.adoc#solrcloud-autoscaling-listeners,Autoscaling Listeners>>
* <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling - Automatically Adding Replicas>>
* <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,Autoscaling Fault Tolerance>>

View File

@@ -73,7 +73,7 @@ The center point using the format "lat,lon" if latitude & longitude. Otherwise,
A spatial indexed field.
`score`::
(Advanced option; not supported by LatLonType (deprecated) or PointType) If the query is used in a scoring context (e.g. as the main query in `q`), this _<<local-parameters-in-queries.adoc#local-parameters-in-queries,local parameter>>_ determines what scores will be produced. Valid values are:
(Advanced option; not supported by LatLonType (deprecated) or PointType) If the query is used in a scoring context (e.g., as the main query in `q`), this _<<local-parameters-in-queries.adoc#local-parameters-in-queries,local parameter>>_ determines what scores will be produced. Valid values are:
* `none`: A fixed score of 1.0. (the default)
* `kilometers`: distance in kilometers between the field value and the specified center point
@@ -84,7 +84,7 @@ A spatial indexed field.
+
[WARNING]
====
Don't use this for indexed non-point shapes (e.g. polygons). The results will be erroneous. And with RPT, it's only recommended for multi-valued point data, as the implementation doesn't scale very well and for single-valued fields, you should instead use a separate non-RPT field purely for distance sorting.
Don't use this for indexed non-point shapes (e.g., polygons). The results will be erroneous. And with RPT, it's only recommended for multi-valued point data, as the implementation doesn't scale very well. For single-valued fields, you should instead use a separate non-RPT field purely for distance sorting.
====
+
When used with `BBoxField`, additional options are supported:
@@ -129,7 +129,7 @@ Here's an example:
`&q=*:*&fq=store:[45,-94 TO 46,-93]`
LatLonType (deprecated) does *not* support rectangles that cross the dateline. For RPT and BBoxField, if you are non-geospatial coordinates (`geo="false"`) then you must quote the points due to the space, e.g. `"x y"`.
LatLonType (deprecated) does *not* support rectangles that cross the dateline. For RPT and BBoxField, if you are using non-geospatial coordinates (`geo="false"`) then you must quote the points due to the space, e.g., `"x y"`.
=== Optimizing: Cache or Not
@@ -193,7 +193,7 @@ RPT offers several functional improvements over LatLonPointSpatialField:
* Non-geodetic geo=false general x & y (_not_ latitude and longitude) -- if desired
* Query by polygons and other complex shapes, in addition to circles & rectangles
* Ability to index non-point shapes (e.g. polygons) as well as points see RptWithGeometrySpatialField
* Ability to index non-point shapes (e.g., polygons) as well as points -- see RptWithGeometrySpatialField
* Heatmap grid faceting
RPT _shares_ various features in common with `LatLonPointSpatialField`. Some are listed here:
@@ -411,7 +411,7 @@ BBoxField is actually based off of 4 instances of another field type referred to
To index a box, add a field value to a bbox field that's a string in the WKT/CQL ENVELOPE syntax. Example: `ENVELOPE(-10, 20, 15, 10)` which is minX, maxX, maxY, minY order. The parameter ordering is unintuitive but that's what the spec calls for. Alternatively, you could provide a rectangular polygon in WKT (or GeoJSON if you set `format="GeoJSON"`).
To search, you can use the `{!bbox}` query parser, or the range syntax e.g. `[10,-10 TO 15,20]`, or the ENVELOPE syntax wrapped in parenthesis with a leading search predicate. The latter is the only way to choose a predicate other than Intersects. For example:
To search, you can use the `{!bbox}` query parser, or the range syntax, e.g., `[10,-10 TO 15,20]`, or the ENVELOPE syntax wrapped in parentheses with a leading search predicate. The latter is the only way to choose a predicate other than Intersects. For example:
[source,plain]
&q={!field f=bbox}Contains(ENVELOPE(-10, 20, 15, 10))

View File

@@ -22,7 +22,7 @@
== cartesianProduct
The `cartesianProduct` function turns a single tuple with a multi-valued field (ie. an array) into multiple tuples, one for each value in the array field. That is, given a single tuple containing an array of N values for fieldA, the `cartesianProduct` function will output N tuples, each with one value from the original tuple's array. In essence, you can flatten arrays for further processing.
The `cartesianProduct` function turns a single tuple with a multi-valued field (i.e., an array) into multiple tuples, one for each value in the array field. That is, given a single tuple containing an array of N values for fieldA, the `cartesianProduct` function will output N tuples, each with one value from the original tuple's array. In essence, you can flatten arrays for further processing.
For example, using `cartesianProduct` you can turn this tuple
[source,text]

View File

@@ -41,7 +41,7 @@ In addition to all the <<the-dismax-query-parser.adoc#dismax-query-parser-parame
Split on whitespace. If set to `true`, text analysis is invoked separately for each individual whitespace-separated term. The default is `false`; whitespace-separated term sequences will be provided to text analysis in one shot, enabling proper function of analysis filters that operate over term sequences, e.g., multi-word synonyms and shingles.
`mm.autoRelax`::
If `true`, the number of clauses required (<<the-dismax-query-parser.adoc#mm-minimum-should-match-parameter,minimum should match>>) will automatically be relaxed if a clause is removed (by e.g. stopwords filter) from some but not all <<the-dismax-query-parser.adoc#qf-query-fields-parameter,`qf`>> fields. Use this parameter as a workaround if you experience that queries return zero hits due to uneven stopword removal between the `qf` fields.
If `true`, the number of clauses required (<<the-dismax-query-parser.adoc#mm-minimum-should-match-parameter,minimum should match>>) will automatically be relaxed if a clause is removed (e.g., by a stopwords filter) from some but not all <<the-dismax-query-parser.adoc#qf-query-fields-parameter,`qf`>> fields. Use this parameter as a workaround if you experience that queries return zero hits due to uneven stopword removal between the `qf` fields.
+
Note that relaxing `mm` may cause undesired side effects, such as hurting the precision of the search, depending on the nature of your index content.
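For example, a request along these lines (the field names and query terms are hypothetical) requires all clauses but lets the requirement relax when a stopword is removed from only some of the `qf` fields:

[source,text]
----
q=the fox&defType=edismax&qf=title body&mm=100%&mm.autoRelax=true
----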

View File

@@ -72,8 +72,8 @@ If you are worried about the SolrJ libraries expanding the size of your client a
Most `SolrClient` implementations (with the notable exception of `CloudSolrClient`) require users to specify one or more Solr base URLs, which the client then uses to send HTTP requests to Solr. The path users include on the base URL they provide has an effect on the behavior of the created client from that point on.
. A URL with a path pointing to a specific core or collection (e.g. `http://hostname:8983/solr/core1`). When a core or collection is specified in the base URL, subsequent requests made with that client are not required to re-specify the affected collection. However, the client is limited to sending requests to that core/collection, and can not send requests to any others.
. A URL with a generic path pointing to the root Solr path (e.g. `http://hostname:8983/solr`). When no core or collection is specified in the base URL, requests can be made to any core/collection, but the affected core/collection must be specified on all requests.
. A URL with a path pointing to a specific core or collection (e.g., `http://hostname:8983/solr/core1`). When a core or collection is specified in the base URL, subsequent requests made with that client are not required to re-specify the affected collection. However, the client is limited to sending requests to that core/collection, and cannot send requests to any others.
. A URL with a generic path pointing to the root Solr path (e.g., `http://hostname:8983/solr`). When no core or collection is specified in the base URL, requests can be made to any core/collection, but the affected core/collection must be specified on all requests.
== Setting XMLResponseParser

View File

@@ -70,7 +70,7 @@ You must specify parameters `amountLongSuffix` and `codeStrSuffix`, correspondin
defaultCurrency="USD" currencyConfig="currency.xml" />
----
In the above example, the raw amount field will use the `"*_l_ns"` dynamic field, which must exist in the schema and use a long field type, i.e. one that extends `LongValueFieldType`. The currency code field will use the `"*_s_ns"` dynamic field, which must exist in the schema and use a string field type, i.e. one that is or extends `StrField`.
In the above example, the raw amount field will use the `"*_l_ns"` dynamic field, which must exist in the schema and use a long field type, i.e., one that extends `LongValueFieldType`. The currency code field will use the `"*_s_ns"` dynamic field, which must exist in the schema and use a string field type, i.e., one that is or extends `StrField`.
.Atomic Updates won't work if dynamic sub-fields are stored
[NOTE]

View File

@@ -62,7 +62,7 @@ These are valid queries: +
Solr's `DateRangeField` supports the same point in time date syntax described above (with _date math_ described below) and more to express date ranges. One class of examples is truncated dates, which represent the entire date span to the precision indicated. The other class uses the range syntax (`[ TO ]`). Here are some examples:
* `2000-11` The entire month of November, 2000.
* `2000-11T13` Likewise but for an hour of the day (1300 to before 1400, i.e. 1pm to 2pm).
* `2000-11T13` Likewise but for an hour of the day (1300 to before 1400, i.e., 1pm to 2pm).
* `-0009` The year 10 BC. A 0 in the year position is 0 AD, and is also considered 1 BC.
* `[2000-11-01 TO 2014-12-01]` The specified date range at a day resolution.
* `[2014 TO 2014-12-01]` From the start of 2014 till the end of the first day of December.

View File

@@ -179,7 +179,7 @@ Special characters in "text" values can be escaped using the escape character `\
|`\t` |horizontal tab
|===
Please note that Unicode sequences (e.g. `\u0001`) are not supported.
Please note that Unicode sequences (e.g., `\u0001`) are not supported.
==== Supported Attributes

View File

@@ -51,11 +51,11 @@ We want to be able to:
. Control which ACLs Solr will add to znodes (ZooKeeper files/folders) it creates in ZooKeeper.
. Control it "from the outside", so that you do not have to modify and/or recompile Solr code to turn this on.
Solr nodes, clients and tools (e.g. ZkCLI) always use a java class called {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/SolrZkClient.html[`SolrZkClient`] to deal with their ZooKeeper stuff. The implementation of the solution described here is all about changing `SolrZkClient`. If you use `SolrZkClient` in your application, the descriptions below will be true for your application too.
Solr nodes, clients, and tools (e.g., ZkCLI) always use a Java class called {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/SolrZkClient.html[`SolrZkClient`] to deal with their ZooKeeper stuff. The implementation of the solution described here is all about changing `SolrZkClient`. If you use `SolrZkClient` in your application, the descriptions below will be true for your application too.
=== Controlling Credentials
You control which credentials provider will be used by configuring the `zkCredentialsProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkCredentialsProvider[`ZkCredentialsProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkCredentialsProvider` such that it will take on the value of the same-named `zkCredentialsProvider` system property if it is defined (e.g. by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkCredentialsProvider` implementation.
You control which credentials provider will be used by configuring the `zkCredentialsProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkCredentialsProvider[`ZkCredentialsProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkCredentialsProvider` such that it will take on the value of the same-named `zkCredentialsProvider` system property if it is defined (e.g., by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkCredentialsProvider` implementation.
==== Out of the Box Credential Implementations
@@ -68,7 +68,7 @@ You can always make your own implementation, but Solr comes with two implementati
=== Controlling ACLs
You control which ACLs will be added by configuring `zkACLProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}//solr-solrj/org/apache/solr/common/cloud/ZkACLProvider[`ZkACLProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkACLProvider` such that it will take on the value of the same-named `zkACLProvider` system property if it is defined (e.g. by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkACLProvider` implementation.
You control which ACLs will be added by configuring the `zkACLProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}//solr-solrj/org/apache/solr/common/cloud/ZkACLProvider[`ZkACLProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkACLProvider` such that it will take on the value of the same-named `zkACLProvider` system property if it is defined (e.g., by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkACLProvider` implementation.
==== Out of the Box ACL Implementations
@@ -90,7 +90,7 @@ If none of the above ACLs is added to the list, the (empty) ACL list of `Default
Notice the overlap in system property names with credentials provider `VMParamsSingleSetCredentialsDigestZkCredentialsProvider` (described above). This is to let the two providers collaborate in a nice and perhaps common way: we always protect access to content by limiting to two users - an admin-user and a readonly-user - AND we always connect with credentials corresponding to this same admin-user, basically so that we can do anything to the content/znodes we create ourselves.
You can give the readonly credentials to "clients" of your SolrCloud cluster - e.g. to be used by SolrJ clients. They will be able to read whatever is necessary to run a functioning SolrJ client, but they will not be able to modify any content in ZooKeeper.
You can give the readonly credentials to "clients" of your SolrCloud cluster - e.g., to be used by SolrJ clients. They will be able to read whatever is necessary to run a functioning SolrJ client, but they will not be able to modify any content in ZooKeeper.
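For illustration, a `SOLR_ZK_CREDS_AND_ACLS` definition wiring together the providers described above follows this general shape (the usernames and passwords here are placeholders you must change):

[source,bash]
----
SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider \
  -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
  -DzkDigestUsername=admin-user -DzkDigestPassword=ADMIN-PASSWORD \
  -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=READONLY-PASSWORD"
----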
=== ZooKeeper ACLs in Solr Scripts