Ref Guide: copy edit new autoscaling pages; fixes in lots of other pages for "eg" and "ie" misspellings

Cassandra Targett 2017-10-26 13:47:12 -05:00
parent 3be50df7b6
commit 79baebb8b9
37 changed files with 208 additions and 212 deletions

View File

@@ -53,7 +53,7 @@ When running the various examples mentioned through out this tutorial (i.e., `bi
 Special notes are included throughout these pages. There are several types of notes:
-=== Information blocks
+=== Information Blocks
 NOTE: These provide additional information that's useful for you to know.

View File

@@ -37,7 +37,7 @@ A `TypeTokenFilterFactory` is available that creates a `TypeTokenFilter` that fi
 For a complete list of the available TokenFilters, see the section <<tokenizers.adoc#tokenizers,Tokenizers>>.
-== When To use a CharFilter vs. a TokenFilter
+== When to Use a CharFilter vs. a TokenFilter
 There are several pairs of CharFilters and TokenFilters that have related (i.e., `MappingCharFilter` and `ASCIIFoldingFilter`) or nearly identical (i.e., `PatternReplaceCharFilterFactory` and `PatternReplaceFilterFactory`) functionality and it may not always be obvious which is the best choice.

View File

@@ -1545,7 +1545,7 @@ The name of the collection the replica belongs to. This parameter is required.
 The name of the shard the replica belongs to. This parameter is required.
 `replica`::
-The replica, e.g. `core_node1`. This parameter is required.
+The replica, e.g., `core_node1`. This parameter is required.
 `property`::
 The property to add. This will have the literal `property.` prepended to distinguish it from system-maintained properties. So these two forms are equivalent:
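For illustration, a hedged sketch of the two equivalent forms using the real `preferredLeader` replica property (the collection, shard, and replica names are placeholders):

[source,bash]
----
# Both requests set the same property; "property." is prepended automatically in the first form.
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&collection=my_collection&shard=shard1&replica=core_node1&property=preferredLeader&property.value=true"
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&collection=my_collection&shard=shard1&replica=core_node1&property=property.preferredLeader&property.value=true"
----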

View File

@@ -113,7 +113,7 @@ If you are on Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in the
 .Bootstrap with chroot
 [NOTE]
 ====
-Using the boostrap command with a zookeeper chroot in the `-zkhost` parameter, e.g. `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
+Using the boostrap command with a zookeeper chroot in the `-zkhost` parameter, e.g., `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
 ====
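As a hedged sketch (the script path is the one shipped with Solr; the Solr home path is a placeholder), a bootstrap call using a chroot might look like:

[source,bash]
----
# The /solr chroot is created automatically before the configsets are uploaded.
server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181/solr -cmd bootstrap -solrhome /path/to/solr/home
----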
 === Put Arbitrary Data into a New ZooKeeper file

View File

@@ -39,14 +39,14 @@ All configuration items, can be retrieved by sending a GET request to the `/conf
 curl http://localhost:8983/solr/techproducts/config
 ----
-To restrict the returned results to a top level section, e.g. `query`, `requestHandler` or `updateHandler`, append the name of the section to the `/config` endpoint following a slash. E.g. to retrieve configuration for all request handlers:
+To restrict the returned results to a top level section, e.g., `query`, `requestHandler` or `updateHandler`, append the name of the section to the `/config` endpoint following a slash. E.g. to retrieve configuration for all request handlers:
 [source,bash]
 ----
 curl http://localhost:8983/solr/techproducts/config/requestHandler
 ----
-To further restrict returned results to a single component within a top level section, use the `componentName` request param, e.g. to return configuration for the `/select` request handler:
+To further restrict returned results to a single component within a top level section, use the `componentName` request param, e.g., to return configuration for the `/select` request handler:
 [source,bash]
 ----
@@ -509,7 +509,7 @@ Directly editing any files without 'touching' the directory *will not* make it v
 It is possible for components to watch for the configset 'touch' events by registering a listener using `SolrCore#registerConfListener()`.
-=== Listening to config Changes
+=== Listening to Config Changes
 Any component can register a listener using:

View File

@@ -58,7 +58,7 @@ Log levels settings are as follows:
 Multiple settings at one time are allowed.
-=== Log level API
+=== Loglevel API
 There is also a way of sending REST commands to the logging endpoint to do the same. Example:
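A hedged sketch of such a REST call (the exact example in the page may differ); it assumes the default port and sets the root logger to WARN:

[source,bash]
----
# Change the root logger's level through the logging endpoint.
curl -s "http://localhost:8983/solr/admin/info/logging" --data-binary "set=root:WARN"
----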

View File

@@ -47,7 +47,7 @@ Both the source and the destination of `copyField` can contain either leading or
 The `copyField` command can use a wildcard (*) character in the `dest` parameter only if the `source` parameter contains one as well. `copyField` uses the matching glob from the source field for the `dest` field name into which the source content is copied.
 ====
-Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained i.e. _you cannot_ copy from `here` to `there` and then from `there` to `elsewhere`. However, the same source field can be copied to multiple destination fields:
+Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained i.e., _you cannot_ copy from `here` to `there` and then from `there` to `elsewhere`. However, the same source field can be copied to multiple destination fields:
 [source,xml]
 ----

View File

@@ -290,14 +290,14 @@ The `core` index will be split into two pieces and written into the two director
 [source,bash]
 http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&split.key=A!
-Here all documents having the same route key as the `split.key` i.e. 'A!' will be split from the `core` index and written to the `targetCore`.
+Here all documents having the same route key as the `split.key` i.e., 'A!' will be split from the `core` index and written to the `targetCore`.
 ==== Usage with ranges parameter:
 [source,bash]
 http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2&targetCore=core3&ranges=0-1f4,1f5-3e8,3e9-5dc
-This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 1001-1500 specified in hexadecimal. Here the index will be split into three pieces with each targetCore receiving documents matching the hash ranges specified i.e. core1 will get documents with hash range 0-500, core2 will receive documents with hash range 501-1000 and finally, core3 will receive documents with hash range 1001-1500. At least one hash range must be specified. Please note that using a single hash range equal to a route key's hash range is NOT equivalent to using the `split.key` parameter because multiple route keys can hash to the same range.
+This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 1001-1500 specified in hexadecimal. Here the index will be split into three pieces with each targetCore receiving documents matching the hash ranges specified i.e., core1 will get documents with hash range 0-500, core2 will receive documents with hash range 501-1000 and finally, core3 will receive documents with hash range 1001-1500. At least one hash range must be specified. Please note that using a single hash range equal to a route key's hash range is NOT equivalent to using the `split.key` parameter because multiple route keys can hash to the same range.
 The `targetCore` must already exist and must have a compatible schema with the `core` index. A commit is automatically called on the `core` index before it is split.

View File

@@ -79,7 +79,7 @@ If `docValues="true"` for a field, then DocValues will automatically be used any
 === Retrieving DocValues During Search
-Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will be also returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g. "`fl=*`") for search queries depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>> & <<defining-fields.adoc#defining-fields,Defining Fields>> for more details.
+Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will be also returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g., "`fl=*`") for search queries depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>> & <<defining-fields.adoc#defining-fields,Defining Fields>> for more details.
 When `useDocValuesAsStored="false"`, non-stored DocValues fields can still be explicitly requested by name in the <<common-query-parameters.adoc#fl-field-list-parameter,fl param>>, but will not match glob patterns (`"*"`). Note that returning DocValues along with "regular" stored fields at query time has performance implications that stored fields may not because DocValues are column-oriented and may therefore incur additional cost to retrieve for each returned document. Also note that while returning non-stored fields from DocValues, the values of a multi-valued field are returned in sorted order (and not insertion order). If you require the multi-valued fields to be returned in the original insertion order, then make your multi-valued field as stored (such a change requires re-indexing).
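As an illustrative sketch (assuming a hypothetical non-stored docValues field named `popularity_d`), such a field can still be requested by name even though `fl=*` would not return it when `useDocValuesAsStored="false"`:

[source,bash]
----
# popularity_d is returned only because it is named explicitly in fl.
curl "http://localhost:8983/solr/techproducts/select?q=*:*&fl=id,name,popularity_d"
----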

View File

@@ -154,7 +154,7 @@ server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clusterprop -n
 server\scripts\cloud-scripts\zkcli.bat -zkhost localhost:2181 -cmd clusterprop -name urlScheme -val https
 ----
-If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.adoc#zookeeper-chroot,chroot for Solr>>, make sure you use the correct `zkhost` string with `zkcli`, e.g. `-zkhost localhost:2181/solr`.
+If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.adoc#zookeeper-chroot,chroot for Solr>>, make sure you use the correct `zkhost` string with `zkcli`, e.g., `-zkhost localhost:2181/solr`.
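Combining the clusterprop command above with such a chroot would look like this (a sketch; host, port, and chroot are placeholders):

[source,bash]
----
# Same clusterprop command, but pointing zkcli at the /solr chroot.
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd clusterprop -name urlScheme -val https
----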
 === Run SolrCloud with SSL
@@ -317,7 +317,7 @@ java -Djavax.net.ssl.keyStorePassword=secret -Djavax.net.ssl.keyStore=../../serv
 === Query Using curl
-Use curl to query the SolrCloud collection created above, from a directory containing the PEM formatted certificate and key created above (e.g. `example/etc/`) - if you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true)`, then you can remove the `-E solr-ssl.pem:secret` option:
+Use curl to query the SolrCloud collection created above, from a directory containing the PEM formatted certificate and key created above (e.g., `example/etc/`) - if you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true)`, then you can remove the `-E solr-ssl.pem:secret` option:
 [source,bash]
 ----

View File

@@ -345,7 +345,7 @@ This filter stems plural English words to their singular form.
 == English Possessive Filter
-This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g. the *s'* in "divers' snorkels", are not removed by this filter.
+This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g., the *s'* in "divers' snorkels", are not removed by this filter.
 *Factory class:* `solr.EnglishPossessiveFilterFactory`
@@ -1608,7 +1608,7 @@ This filter splits tokens at word delimiters.
 .Word Delimiter Filter has been Deprecated
 [WARNING]
 ====
-Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that e.g. phrase queries can work correctly.
+Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that e.g., phrase queries can work correctly.
 ====
 *Factory class:* `solr.WordDelimiterFilterFactory`

View File

@@ -145,7 +145,7 @@ bin/solr restart -c -p 7574 -z localhost:9983 -s example/cloud/node2/solr
 Notice that you need to specify the ZooKeeper address (`-z localhost:9983`) when starting node2 so that it can join the cluster with node1.
-=== Adding a node to a cluster
+=== Adding a Node to a Cluster
 Adding a node to an existing cluster is a bit advanced and involves a little more understanding of Solr. Once you startup a SolrCloud cluster using the startup scripts, you can add a new node to it by:

View File

@@ -82,7 +82,7 @@ The default is `<em>`.
 The default is `</em>`.
 `hl.encoder`::
-If blank, the default, then the stored text will be returned without any escaping/encoding performed by the highlighter. If set to `html` then special HMTL/XML characters will be encoded (e.g. `&` becomes `\&amp;`). The pre/post snippet characters are never encoded.
+If blank, the default, then the stored text will be returned without any escaping/encoding performed by the highlighter. If set to `html` then special HMTL/XML characters will be encoded (e.g., `&` becomes `\&amp;`). The pre/post snippet characters are never encoded.
 `hl.maxAnalyzedChars`::
 The character limit to look for highlights, after which no highlighting will be done. This is mostly only a performance concern for an _analysis_ based offset source since it's the slowest. See <<Schema Options and Performance Considerations>>.
@@ -146,7 +146,7 @@ There are four highlighters available that can be chosen at runtime with the `hl
 +
 The Unified Highlighter is the newest highlighter (as of Solr 6.4), which stands out as the most flexible and performant of the options. We recommend that you try this highlighter even though it isn't the default (yet).
 +
-This highlighter supports the most common highlighting parameters and can handle just about any query accurately, even SpanQueries (e.g. as seen from the `surround` parser). A strong benefit to this highlighter is that you can opt to configure Solr to put more information in the underlying index to speed up highlighting of large documents; multiple configurations are supported, even on a per-field basis. There is little or no such flexibility for the other highlighters. More on this below.
+This highlighter supports the most common highlighting parameters and can handle just about any query accurately, even SpanQueries (e.g., as seen from the `surround` parser). A strong benefit to this highlighter is that you can opt to configure Solr to put more information in the underlying index to speed up highlighting of large documents; multiple configurations are supported, even on a per-field basis. There is little or no such flexibility for the other highlighters. More on this below.
 <<The Original Highlighter,Original Highlighter>>:: (`hl.method=original`, the default)
 +
@@ -304,7 +304,7 @@ If `true`, multi-valued fields will return all values in the order they were sav
 `hl.payloads`::
 When `hl.usePhraseHighlighter` is `true` and the indexed field has payloads but not term vectors (generally quite rare), the index's payloads will be read into the highlighter's memory index along with the postings.
 +
-If this may happen and you know you don't need them for highlighting (i.e. your queries don't filter by payload) then you can save a little memory by setting this to false.
+If this may happen and you know you don't need them for highlighting (i.e., your queries don't filter by payload) then you can save a little memory by setting this to false.
 The Original Highlighter has a plugin architecture that enables new functionality to be registered in `solrconfig.xml`. The "```techproducts```" configset shows most of these settings explicitly. You can use it as a guide to provide your own components to include a `SolrFormatter`, `SolrEncoder`, and `SolrFragmenter.`

View File

@@ -101,7 +101,7 @@ Integer specifying how many backups to keep. This can be used to delete all but
 The configuration files to replicate, separated by a comma.
 `commitReserveDuration`::
-If your commits are very frequent and your network is slow, you can tweak this parameter to increase the amount of time expected to be required to transfer data. The default is `00:00:10` i.e. 10 seconds.
+If your commits are very frequent and your network is slow, you can tweak this parameter to increase the amount of time expected to be required to transfer data. The default is `00:00:10` i.e., 10 seconds.
 The example below shows a possible 'master' configuration for the `ReplicationHandler`, including a fixed number of backups and an invariant setting for the `maxWriteMBPerSec` request parameter to prevent slaves from saturating its network interface

View File

@@ -165,7 +165,7 @@ Expert options:
 `caseFirst`:: Valid values are `lower` or `upper`. Useful to control which is sorted first when case is not ignored.
-`numeric`:: (true/false) If true, digits are sorted according to numeric value, e.g. foobar-9 sorts before foobar-10. The default is false.
+`numeric`:: (true/false) If true, digits are sorted according to numeric value, e.g., foobar-9 sorts before foobar-10. The default is false.
 `variableTop`:: Single character or contraction. Controls what is variable for `alternate`.

View File

@@ -46,7 +46,7 @@ A feature is a value, a number, that represents some quantity or quality of the
 ==== Normalizer
-Some ranking models expect features on a particular scale. A normalizer can be used to translate arbitrary feature values into normalized values e.g. on a 0..1 or 0..100 scale.
+Some ranking models expect features on a particular scale. A normalizer can be used to translate arbitrary feature values into normalized values e.g., on a 0..1 or 0..100 scale.
 === Training Models
@@ -82,7 +82,7 @@ The ltr contrib module includes a <<transforming-result-documents.adoc#transform
 ==== Feature Selection and Model Training
-Feature selection and model training take place offline and outside Solr. The ltr contrib module supports two generalized forms of models as well as custom models. Each model class's javadocs contain an example to illustrate configuration of that class. In the form of JSON files your trained model or models (e.g. different models for different customer geographies) can then be directly uploaded into Solr using provided REST APIs.
+Feature selection and model training take place offline and outside Solr. The ltr contrib module supports two generalized forms of models as well as custom models. Each model class's javadocs contain an example to illustrate configuration of that class. In the form of JSON files your trained model or models (e.g., different models for different customer geographies) can then be directly uploaded into Solr using provided REST APIs.
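As a hedged sketch of such an upload (the collection name and file path are placeholders; `model-store` is the managed resource endpoint provided by the ltr contrib module):

[source,bash]
----
# Upload a trained model definition from a JSON file.
curl -XPUT "http://localhost:8983/solr/techproducts/schema/model-store" \
  --data-binary "@/path/to/myModel.json" -H "Content-type:application/json"
----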
 [cols=",,",options="header",]
 |===
@@ -609,7 +609,7 @@ The feature store and the model store are both <<managed-resources.adoc#managed-
 * Conventions used:
 ** `<store>.json` file contains features for the `<store>` feature store
 ** `<model>.json` file contains model name `<model>`
-** a 'generation' id (e.g. `YYYYMM` year-month) is part of the feature store and model names
+** a 'generation' id (e.g., `YYYYMM` year-month) is part of the feature store and model names
 ** The model's features and weights are sorted alphabetically by name, this makes it easy to see what the commonalities and differences between the two models are.
 ** The stores features are sorted alphabetically by name, this makes it easy to see what the commonalities and differences between the two feature stores are.

View File

@@ -299,10 +299,10 @@ and re-reported elsewhere as necessary.
 In case of shard reporter the target node is the shard leader, in case of cluster reporter the
 target node is the Overseer leader.
-=== Shard reporter
+=== Shard Reporter
 This reporter uses predefined `shard` group, and the implementing class must be (a subclass of)
 `solr.SolrShardReporter`. It publishes selected metrics from replicas to the node where shard leader is
-located. Reports use a target registry name that is the replica's registry name with a `.leader` suffix, eg. for a
+located. Reports use a target registry name that is the replica's registry name with a `.leader` suffix, e.g., for a
 SolrCore name `collection1_shard1_replica_n3` the target registry name is
 `solr.core.collection1.shard1.replica_n3.leader`.
@@ -334,7 +334,7 @@ Example configuration:
 </reporter>
 ----
-=== Cluster reporter
+=== Cluster Reporter
 This reporter uses predefined `cluster` group and the implementing class must be (a subclass of)
 `solr.SolrClusterReporter`. It publishes selected metrics from any local registry to the Overseer leader node.

View File

@@ -61,7 +61,7 @@ One point of confusion is how much data is contained in a tlog. A tlog does not
 WARNING: Implicit in the above is that transaction logs will grow forever if hard commits are disabled. Therefore it is important that hard commits be enabled when indexing.
-=== Configuring commits
+=== Configuring Commits
 As mentioned above, it is usually preferable to configure your commits (both hard and soft) in `solrconfig.xml` and avoid sending commits from an external source. Check your `solrconfig.xml` file since the defaults are likely not tuned to your needs. Here is an example NRT configuration for the two flavors of commit, a hard commit every 60 seconds and a soft commit every 30 seconds. Note that these are _not_ the values in some of the examples!

View File

@@ -40,7 +40,7 @@ When modifying the schema with the API, a core reload will automatically occur i
 .Re-index after schema modifications!
 [IMPORTANT]
 ====
-If you modify your schema, you will likely need to re-index all documents. If you do not, you may lose access to documents, or not be able to interpret them properly, e.g. after replacing a field type.
+If you modify your schema, you will likely need to re-index all documents. If you do not, you may lose access to documents, or not be able to interpret them properly, e.g., after replacing a field type.
 Modifying your schema will never modify any documents that are already indexed. You must re-index documents in order to apply schema changes to them. Queries and updates made after the change may encounter errors that were not present before the change. Completely deleting the index and rebuilding it is usually the only option to fix such errors.
 ====
@@ -595,7 +595,7 @@ If neither the `fl` query parameter nor the `fieldname` path parameter is specif
 If `false`, the default, matching dynamic fields will not be returned.
 `showDefaults`::
-If `true`, all default field properties from each field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
+If `true`, all default field properties from each field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
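For example (a sketch assuming the `techproducts` configset), listing all fields with default properties included might look like:

[source,bash]
----
# showDefaults=true adds inherited field type defaults such as "tokenized" to each field.
curl "http://localhost:8983/solr/techproducts/schema/fields?showDefaults=true"
----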
 ==== List Fields Response
@@ -670,7 +670,7 @@ The query parameters can be added to the API request after a '?'.
 Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
 `showDefaults`::
-If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
+If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
 ==== List Dynamic Field Response
@@ -753,7 +753,7 @@ The query parameters can be added to the API request after a '?'.
 Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
 `showDefaults`::
-If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
+If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g., `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
 ==== List Field Type Response

View File

@@ -463,7 +463,7 @@ Other collections can share the same configuration by specifying the name of the
 ==== Data-driven Schema and Shared Configurations
-The `_default` schema can mutate as data is indexed, since it has schemaless functionality (i.e. data-driven changes to the schema). Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. You can turn off schemaless functionality (i.e. data-driven changes to the schema) for a collection by the following (assuming the collection name is `mycollection`):
+The `_default` schema can mutate as data is indexed, since it has schemaless functionality (i.e., data-driven changes to the schema). Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. You can turn off schemaless functionality (i.e., data-driven changes to the schema) for a collection by the following (assuming the collection name is `mycollection`):
 `curl http://host:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'`

View File

@@ -24,7 +24,7 @@ The Autoscaling API is used to manage autoscaling policies, preferences, trigger
 == Read API
-The autoscaling Read API is available at `/admin/autoscaling` or `/solr/cluster/autoscaling`. It returns information about the configured cluster preferences, cluster policy, collection-specific policies triggers and listeners.
+The autoscaling Read API is available at `/solr/admin/autoscaling` or `/api/cluster/autoscaling` (v2 API style). It returns information about the configured cluster preferences, cluster policy, collection-specific policies triggers and listeners.
 This API does not take any parameters.
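A hedged sketch of reading the current autoscaling configuration with both endpoint styles mentioned above:

[source,bash]
----
# v1 style endpoint
curl "http://localhost:8983/solr/admin/autoscaling"
# v2 style endpoint
curl "http://localhost:8983/api/cluster/autoscaling"
----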
@@ -145,26 +145,27 @@ However, since the first node in the first example had more than 1 replica for a
 In the above example the node with port 8983 has two replicas for `shard1` in violation of our policy.
 == History API
-History of autoscaling events is available at `/admin/autoscaling/history`. It returns information
+The history of autoscaling events is available at `/admin/autoscaling/history`. It returns information
 about past autoscaling events and details about their processing. This history is kept in
-the `.system` collection, and populated by a trigger listener `SystemLogListener` - by default this
+the `.system` collection, and is populated by a trigger listener `SystemLogListener`. By default this
 listener is added to all new triggers.
 History events are regular Solr documents so they can be also accessed directly by
-searching on the `.system` collection. History handler acts as a regular search handler, so all
+searching on the `.system` collection. The history handler acts as a regular search handler, so all
 query parameters supported by `/select` handler for that collection are supported here too.
 However, the history handler makes this
 process easier by offering a simpler syntax and knowledge of field names
 used by `SystemLogListener` for serialization of event data.
-History documents contain also the action context, if it was available, which gives
-further insight into e.g. exact operations that were computed and/or executed.
+History documents contain the action context, if it was available, which gives
+further insight into e.g., exact operations that were computed and/or executed.
 Specifically, the following query parameters can be used (they are turned into
 filter queries, so an implicit AND is applied):
 * `trigger` - trigger name
-* `eventType` - event type / trigger type (eg. `nodeAdded`)
+* `eventType` - event type / trigger type (e.g., `nodeAdded`)
 * `collection` - collection name involved in event processing
 * `stage` - event processing stage
 * `action` - trigger action
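As an illustrative sketch (the trigger name is a placeholder), combining two of these parameters filters the history to successful events from one trigger:

[source,bash]
----
# Returns only SUCCEEDED events generated by the node_lost_trigger trigger.
curl "http://localhost:8983/solr/admin/autoscaling/history?trigger=node_lost_trigger&stage=SUCCEEDED"
----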
@@ -236,18 +237,18 @@ filter queries, so an implicit AND is applied):
 .Broken v2 API support
 [WARNING]
 ====
-Due to a bug in Solr 7.1.0, the History API is available only at the path /admin/autoscaling/history. Using the /api/cluster/autoscaling/history endpoint returns an error.
+Due to a bug in Solr 7.1.0, the History API is available only at the path `/admin/autoscaling/history`. Using the `/api/cluster/autoscaling/history` endpoint returns an error.
 ====
 == Write API
-The Write API is available at the same `/admin/autoscaling` and `/api/cluster/autoscaling` endpoints as the read API but can only be used with the *POST* HTTP verb.
+The Write API is available at the same `/admin/autoscaling` and `/api/cluster/autoscaling` endpoints as the Read API but can only be used with the *POST* HTTP verb.
 The payload of the POST request is a JSON message with commands to set and remove components. Multiple commands can be specified together in the payload. The commands are executed in the order specified and the changes are atomic, i.e., either all succeed or none.
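A hedged sketch of one such POST carrying two commands in a single payload (the preference and policy values are illustrative only):

[source,bash]
----
# Both commands are applied atomically, in the order given.
curl -X POST -H "Content-type:application/json" \
  "http://localhost:8983/solr/admin/autoscaling" -d '{
  "set-cluster-preferences": [{"minimize": "cores"}],
  "set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}]
}'
----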
 === Create and Modify Cluster Preferences
-Cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in order.
+Cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in the order they are set.
 They are defined using the `set-cluster-preferences` command.
@@ -336,7 +337,7 @@ Refer to the <<solrcloud-autoscaling-policy-preferences.adoc#policy-specificatio
 }
 ----
-Output:
+*Output*:
 [source,json]
 ----
 {
@@ -422,7 +423,7 @@ If you attempt to remove a policy that is being used by a collection, this comma
 === Create/Update Trigger
-The set-trigger command can be used to create a new trigger or overwrite an existing one.
+The `set-trigger` command can be used to create a new trigger or overwrite an existing one.
 You can see the section <<solrcloud-autoscaling-triggers.adoc#trigger-configuration,Trigger Configuration>> for a full list of configuration options.
@@ -464,7 +465,7 @@ You can see the section <<solrcloud-autoscaling-triggers.adoc#trigger-configurat
 === Remove Trigger
-The remove-trigger command can be used to remove a trigger. It accepts a single parameter: the name of the trigger.
+The `remove-trigger` command can be used to remove a trigger. It accepts a single parameter: the name of the trigger.
 .Removing the nodeLost Trigger
 [source,json]
@@ -478,7 +479,7 @@ The remove-trigger command can be used to remove a trigger. It accepts a single
 === Create/Update Trigger Listener
-The set-listener command can be used to create or modify a listener for a trigger.
+The `set-listener` command can be used to create or modify a listener for a trigger.
 You can see the section <<solrcloud-autoscaling-listeners.adoc#listener-configuration,Trigger Listener Configuration>> for a full list of configuration options.
@@ -497,7 +498,7 @@ You can see the section <<solrcloud-autoscaling-listeners.adoc#listener-configur
 === Remove Trigger Listener
-The remove-listener command can be used to remove an existing listener. It accepts a single parameter: the name of the listener.
+The `remove-listener` command can be used to remove an existing listener. It accepts a single parameter: the name of the listener.
 .Removing the foo listener
 [source,json]

View File

@@ -21,28 +21,27 @@
 Solr provides a way to automatically add replicas for a collection when the number of active replicas drops below
 the replication factor specified at the time of the creation of the collection.
-== `autoAddReplicas` parameter
+== The autoAddReplicas Parameter
-The boolean `autoAddReplicas` parameter can be passed to the Create Collection API to enable this feature for a given collection.
+The boolean `autoAddReplicas` parameter can be passed to the CREATE command of the Collection API to enable this feature for a given collection.
-.Creating a collection with autoAddReplicas feature
-`http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=5&autoAddReplicas=true`
+.Create a collection with autoAddReplicas enabled
+[source,text]
+http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=5&autoAddReplicas=true
-The modify collection API can be used to enable/disable this feature for any collection.
+The MODIFYCOLLECTION command can be used to enable/disable this feature for any collection.
-.Modifying collection to disable autoAddReplicas feature
-`http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&name=my_collection&autoAddReplicas=false`
+.Modify collection to disable autoAddReplicas
+[source,text]
+http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&name=my_collection&autoAddReplicas=false
-== Implementation using `.autoAddReplicas` trigger
+== Implementation Using .autoAddReplicas Trigger
 A Trigger named `.autoAddReplicas` is automatically created whenever any collection has the autoAddReplicas feature enabled.
-Only one trigger is sufficient to server all collections having this feature enabled. The `.autoAddReplicas` trigger watches
-for nodes that are lost from the cluster and uses the default TriggerActions to create new replicas to replace the ones
-which were hosted by the lost node. If the old node comes back online, it unloads the moved replicas and is free to host other
-replicas as and when required.
-Since the trigger provides the autoAddReplicas feature for all collections, the suspend-trigger and resume-trigger APIs
-can be used to disable and enable this feature for all collections in one API call.
+Only one trigger is sufficient to serve all collections having this feature enabled. The `.autoAddReplicas` trigger watches for nodes that are lost from the cluster and uses the default `TriggerActions` to create new replicas to replace the ones which were hosted by the lost node. If the old node comes back online, it unloads the moved replicas and the node is free to host other replicas as and when required.
+Since the trigger provides the autoAddReplicas feature for all collections, the `suspend-trigger` and `resume-trigger` Autoscaling API commands can be used to disable and enable this feature for all collections in one API call. See <<solrcloud-autoscaling-auto-add-replicas.adoc#>>
 .Suspending autoAddReplicas for all collections
 [source,json]
@@ -64,13 +63,13 @@ can be used to disable and enable this feature for all collections in one API ca
 }
 ----
-== Using cluster property to enable/disable autoAddReplicas
+== Using Cluster Property to Enable autoAddReplicas
 A cluster property, also named `autoAddReplicas`, can be set to `false` to disable this feature for all collections.
-If this cluster property is missing or set to `true` the autoAddReplicas is enabled for all collections.
+If this cluster property is missing or set to `true`, the autoAddReplicas is enabled for all collections.
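One hedged way to set that cluster property is the Collections API CLUSTERPROP action (shown only for illustration, given the deprecation note below):

[source,bash]
----
# Disable autoAddReplicas for all collections via the cluster property.
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=autoAddReplicas&val=false"
----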
 .Deprecation Warning
 [WARNING]
 ====
-Using cluster property to enable/disable autoAddReplicas is deprecated and only supported for back compatibility. Please use the suspend-trigger and resume-trigger APIs instead.
+Using a cluster property to enable or disable autoAddReplicas is deprecated and only supported for back compatibility. Please use the `suspend-trigger` and `resume-trigger` API commands instead.
 ====

View File

@@ -18,43 +18,44 @@
 // specific language governing permissions and limitations
 // under the License.
-== Node added / lost markers
-Since triggers execute on the node that runs Overseer, should this node go down the `nodeLost`
+The autoscaling framework uses a few strategies to ensure it's able to still trigger actions in the event of unexpected changes to the system.
+== Node Added or Lost Markers
+Since triggers execute on the node that runs the Overseer, should the Overseer node go down the `nodeLost`
 event would be lost because there would be no mechanism to generate it. Similarly, if a node has
-been added between the Overseer leader change was completed the `nodeAdded` event would not be
+been added before the Overseer leader change was completed, the `nodeAdded` event would not be
 generated.
 For this reason Solr implements additional mechanisms to ensure that these events are generated
 reliably.
-When a node joins a cluster its presence is marked as an ephemeral ZK path in the `/live_nodes/<nodeName>`
-ZooKeeper directory, but now also an ephemeral path is created under `/autoscaling/nodeAdded/<nodeName>`.
+With standard SolrCloud behavior, when a node joins a cluster its presence is marked as an ephemeral ZooKeeper path in the `/live_nodes/<nodeName>` ZooKeeper directory. Now an ephemeral path is also created under `/autoscaling/nodeAdded/<nodeName>`.
 When a new instance of Overseer leader is started it will run the `nodeAdded` trigger (if it's configured)
-and discover the presence of this ZK path, at which point it will remove it and generate a `nodeAdded` event.
+and discover the presence of this ZooKeeper path, at which point it will remove it and generate a `nodeAdded` event.
-When a node leaves the cluster up to three remaining nodes will try to create a persistent ZK path
+When a node leaves the cluster, up to three remaining nodes will try to create a persistent ZooKeeper path
 `/autoscaling/nodeLost/<nodeName>` and eventually one of them succeeds. When a new instance of Overseer leader
-is started it will run the `nodeLost` trigger (if it's configured) and discover the presence of this ZK
+is started it will run the `nodeLost` trigger (if it's configured) and discover the presence of this ZooKeeper
 path, at which point it will remove it and generate a `nodeLost` event.
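If needed, the marker paths described above can be inspected directly; a sketch using the `bin/solr zk` utility (assuming the embedded ZooKeeper on port 9983):

[source,bash]
----
# List any pending nodeAdded and nodeLost markers.
bin/solr zk ls /autoscaling/nodeAdded -z localhost:9983
bin/solr zk ls /autoscaling/nodeLost -z localhost:9983
----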
-== Trigger state checkpointing
+== Trigger State Checkpointing
-Triggers generate events based on their internal state. If Overseer leader goes down while the trigger is
+Triggers generate events based on their internal state. If the Overseer leader goes down while the trigger is
 about to generate a new event, it's likely that the event would be lost because a new trigger instance
 running on the new Overseer leader would start from a clean slate.
-For this reason after each time a trigger is executed its internal state is persisted to ZooKeeper, and
+For this reason, after each time a trigger is executed its internal state is persisted to ZooKeeper, and
 on Overseer start its internal state is restored.
-== Trigger event queues
+== Trigger Event Queues
 Autoscaling framework limits the rate at which events are processed using several different mechanisms.
 One is the locking mechanism that prevents concurrent
 processing of events, and another is a single-threaded executor that runs trigger actions.
-This means that the processing of an event may take significant time, and during this time it's possible that
+This means that the processing of an event may take significant time, and during this time it's possible that the
 Overseer may go down. In order to avoid losing events that were already generated but not yet fully
-processed events are queued before processing is started.
+processed, events are queued before processing is started.
 Separate ZooKeeper queues are created for each trigger, and events produced by triggers are put on these
 per-trigger queues. When a new Overseer leader is started it will first check
 these queues and process events accumulated there, and only then it will continue to run triggers
 normally. Queued events that fail processing during this "replay" stage are discarded.

View File

@ -18,19 +18,18 @@
// specific language governing permissions and limitations // specific language governing permissions and limitations
// under the License. // under the License.
Trigger listener API allows users to provide additional behavior related to trigger events as they are being processed. Trigger Listeners allow users to configure additional behavior related to trigger events as they are being processed.
For example, users may want to record autoscaling events to an external system, or notify administrator when a For example, users may want to record autoscaling events to an external system, or notify an administrator when a
particular type of event occurs, or when its processing reaches certain stage (eg. failed). particular type of event occurs or when its processing reaches certain stage (e.g., failed).
Listener configuration always refers to a specific trigger configuration - listener is notified of Listener configuration always refers to a specific trigger configuration because a listener is notified of
events generated by that specific trigger. Several (or none) named listeners can be registered for a trigger, events generated by that specific trigger. Several (or none) named listeners can be registered for a trigger,
and they will be notified in the order in which they were defined. and they will be notified in the order in which they were defined.
Listener configuration can specify what processing stages are of interest - when an event enters this processing stage Listener configuration can specify what processing stages are of interest, and when an event enters this processing stage the listener will be notified. Currently the following stages are recognized:
the listener will be notified. Currently the following stages are recognized:
* STARTED - when event has been generated by a trigger and its processing is starting. * STARTED - when an event has been generated by a trigger and its processing is starting.
* ABORTED - when event was being processed while the source trigger closed. * ABORTED - when event was being processed while the source trigger closed.
* BEFORE_ACTION - when a `TriggerAction` is about to be invoked. Action name and the current `ActionContext` are passed to the listener. * BEFORE_ACTION - when a `TriggerAction` is about to be invoked. Action name and the current `ActionContext` are passed to the listener.
* AFTER_ACTION - after a `TriggerAction` has been successfully invoked. Action name, `ActionContext` and the list of action * AFTER_ACTION - after a `TriggerAction` has been successfully invoked. Action name, `ActionContext` and the list of action
@ -38,28 +37,27 @@ the listener will be notified. Currently the following stages are recognized:
* FAILED - when event processing failed (or when a `TriggerAction` failed) * FAILED - when event processing failed (or when a `TriggerAction` failed)
* SUCCEEDED - when event processing completes successfully * SUCCEEDED - when event processing completes successfully
Listener configuration can also specify what particular actions are of interest, both Listener configuration can also specify what particular actions are of interest, both before and/or after they are invoked.
before and/or after they are invoked.
== Listener configuration == Listener Configuration
Currently the following listener configuration properties are supported: Currently the following listener configuration properties are supported:
* `name` - (string, required) A unique listener configuration name.
* `trigger` - (string, required) The name of an existing trigger configuration.
* `class` - (string, required) A listener implementation class name.
* `stage` - (list of strings, optional, case insensitive) A list of processing stages for which this listener will be notified. Default is an empty list.
* `beforeAction` - (list of strings, optional) A list of action names (as defined in trigger configuration) before which the listener will be notified. Default is an empty list.
* `afterAction` - (list of strings, optional) A list of action names after which the listener will be notified. Default is an empty list.
* Additional implementation-specific properties may be provided.
Note: when both `stage` and `beforeAction` / `afterAction` lists are non-empty, the listener will be notified both
when a specified stage is entered and before / after the specified actions.
=== Managing Listener Configurations

Listener configurations can be managed using the Autoscaling Write API and the `set-listener` and `remove-listener`
commands.

For example:
@ -85,17 +83,15 @@ For example:
}
----
== Listener Implementations

Trigger listeners must implement the `TriggerListener` interface. Solr provides some
implementations of trigger listeners, which cover common use cases. These implementations are described below, together with their configuration parameters.

=== SystemLogListener
This trigger listener sends trigger events and processing context as documents for indexing in
the SolrCloud `.system` collection.

When a trigger configuration is first created, a corresponding trigger listener configuration that
uses `SystemLogListener` is also automatically created, to make sure that all events and
actions related to the autoscaling framework are logged to the `.system` collection.
@ -121,7 +117,6 @@ Documents created by this listener have several predefined fields:
* `event_str` - JSON representation of all event properties
* `context_str` - JSON representation of all `ActionContext` properties, if available

The following fields are created using the information from the trigger event:

* `event.id_s` - event id
@ -145,23 +140,22 @@ trigger named `foo`):
}
----
=== HttpTriggerListener

This listener uses HTTP POST to send a representation of the event and context to a specified URL.
The URL, payload, and headers may contain property substitution patterns, which are then replaced with values taken from the current event or context properties.

Templates use the same syntax as property substitution in Solr configuration files, e.g.,
`${foo.bar:baz}` means that the value of the `foo.bar` property should be taken, and `baz` should be used
if the value is absent.
Supported configuration properties:

* `url` - (string, required) A URL template.
* `payload` - (string, optional) A payload template. If absent, a JSON map of all properties listed above will be used.
* `contentType` - (string, optional) The payload content type. If absent, `application/json` will be used.
* `header.*` - (string, optional) Header template(s). The name of the property without the `header.` prefix defines the literal header name.
* `timeout` - (int, optional) Connection and socket timeout in milliseconds. Default is `60000` milliseconds (60 seconds).
* `followRedirects` - (boolean, optional) Allows following redirects. Default is `false`.

The following properties are available in the context and can be referenced from templates:
@ -173,7 +167,7 @@ The following properties are available in context and can be referenced from tem
* `error` - optional error string (from `Throwable.toString()`)
* `message` - optional message

.Example HttpTriggerListener
[source,json]
----
{
@ -185,12 +179,12 @@ Example configuration:
"header.X-Trigger": "${config.trigger}", "header.X-Trigger": "${config.trigger}",
"payload": "actionName=${actionName}, source=${event.source}, type=${event.eventType}", "payload": "actionName=${actionName}, source=${event.source}, type=${event.eventType}",
"contentType": "text/plain", "contentType": "text/plain",
"stage": ["STARTED", "ABORTED", SUCCEEDED", "FAILED"], "stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"],
"beforeAction": ["compute_plan", "execute_plan"], "beforeAction": ["compute_plan", "execute_plan"],
"afterAction": ["compute_plan", "execute_plan"] "afterAction": ["compute_plan", "execute_plan"]
} }
---- ----
This configuration specifies that each time one of the listed stages is reached, or before and after each of the listed
actions is executed, the listener will send the templated payload to a URL that also depends on the config and the current event,
with a custom header that indicates the trigger name.
@ -24,56 +24,57 @@ Autoscaling in Solr aims to provide good defaults so a SolrCloud cluster remains
A simple example is automatically adding a replica for a SolrCloud collection when a node containing an existing replica goes down.

The goal of autoscaling in SolrCloud is to make cluster management easier, more automatic, and more intelligent. It aims to provide good defaults such that the cluster remains balanced and stable in the face of various events such as a node joining the cluster or leaving the cluster. This is achieved by satisfying a set of rules and sorting preferences that help Solr select the target of cluster management operations.

There are three distinct problems that this feature solves:

* When to run cluster management tasks? For example, we might want to add a replica when an existing replica is no longer alive.
* Which cluster management task to run? For example, do we add a new replica or should we move an existing one to a new node?
* How do we run the cluster management tasks so the cluster remains balanced and stable?

Before we get into the details of how each of these problems is solved, let's take a quick look at the easiest way to set up autoscaling for your cluster.
== Quick Start: Automatically Adding Replicas

Say that we want to create a collection which requires three replicas to be available for each shard at all times. We can set `replicationFactor=3` when creating the collection, but what happens if a node containing one or more of the replicas crashes or is shut down for maintenance? In such a case, we'd like to create additional replicas to replace the ones that are no longer available, preserving the original number of replicas.

We have an easy way to enable this behavior without needing to understand the autoscaling features in depth. We can create a collection with such behavior by adding the parameter `autoAddReplicas=true` with the CREATE command of the Collection API. For example:

[source,text]
/admin/collections?action=CREATE&name=_name_of_collection_&numShards=1&replicationFactor=3&autoAddReplicas=true

A collection created with `autoAddReplicas=true` will be monitored by Solr such that if a node containing a replica of this collection goes down, Solr will add new replicas on other nodes after waiting for up to thirty seconds for the node to come back.

See the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling Automatically Adding Replicas>> to learn more about how to enable or disable this feature as well as other details.

The selection of the node that will host the new replica is made according to the default cluster preferences, which we will learn more about in the next sections.
== Cluster Preferences

Cluster preferences, as the name suggests, apply to all cluster management operations regardless of which collection they affect.

A preference is a set of conditions that help Solr select nodes that either maximize or minimize given metrics. For example, a preference such as `{minimize:cores}` will help Solr select nodes such that the number of cores on each node is minimized. We write cluster preferences in a way that reduces the overall load on the system. You can add more than one preference to break ties.

The default cluster preferences consist of the above example (`{minimize:cores}`), which is to minimize the number of cores on all nodes.

You can learn more about preferences in the <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Cluster Preferences>> section.
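For illustration only, cluster preferences can be changed with the Autoscaling API's `set-cluster-preferences` command. The following is a minimal sketch of such a payload, assuming the API is reachable at the `/admin/autoscaling` endpoint; the `freedisk` tie-breaker shown here is just an example, not a default:

[source,json]
----
{
  "set-cluster-preferences": [
    {"minimize": "cores"},
    {"maximize": "freedisk", "precision": 10}
  ]
}
----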
== Cluster Policy

A cluster policy is a set of conditions that a node, shard, or collection must satisfy before it can be chosen as the target of a cluster management operation. These conditions are applied across the cluster regardless of the collection being managed. For example, the condition `{"cores":"<10", "node":"#ANY"}` means that any node must have less than 10 Solr cores in total, regardless of which collection they belong to.

There are many metrics on which the condition can be based, e.g., system load average, heap usage, free disk space, etc. The full list of supported metrics can be found in the section describing <<solrcloud-autoscaling-policy-preferences.adoc#policy-attributes,Autoscaling Policy Attributes>>.

When a node, shard, or collection does not satisfy the policy, we call it a *violation*. Solr ensures that cluster management operations minimize the number of violations. Cluster management operations are currently invoked manually. In the future, these cluster management operations may be invoked automatically in response to cluster events such as a node being added or lost.
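As a sketch, the example condition above could be installed as the cluster policy with a `set-cluster-policy` command POSTed to the Autoscaling API (the `/admin/autoscaling` path is assumed here):

[source,json]
----
{
  "set-cluster-policy": [
    {"cores": "<10", "node": "#ANY"}
  ]
}
----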
== Collection-Specific Policies

A collection may need conditions in addition to those specified in the cluster policy. In such cases, we can create named policies that can be used for specific collections. First, we can use the `set-policy` API to create a new policy and then specify the `policy=<policy_name>` parameter to the CREATE command of the Collection API:

[source,text]
/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=2&policy=policy1

The above CREATE collection command will associate a policy named `policy1` with the collection named `coll1`. Only a single policy may be associated with a collection.

Note that the collection-specific policy is applied *in addition to* the cluster policy, i.e., it is not an override but an augmentation. Therefore the collection will follow all conditions laid out in the cluster preferences, cluster policy, and the policy named `policy1`.
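For illustration, the named policy itself would be created beforehand with a `set-policy` command. This is only a sketch; the policy name and the condition shown (at most one replica of each shard per node) are assumed examples:

[source,json]
----
{
  "set-policy": {
    "policy1": [
      {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
    ]
  }
}
----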
@ -81,9 +82,11 @@ You can learn more about collection-specific policies in the section <<solrclou
== Triggers

Now that we have an idea of how cluster management operations use policies and preferences to help Solr keep the cluster balanced and stable, we can talk about when to invoke such operations.

Triggers are used to watch for events such as a node joining or leaving the cluster. When the event happens, the trigger executes a set of actions that compute and execute a *plan*, i.e., a set of operations to change the cluster so that the policy and preferences are respected.

The `autoAddReplicas` parameter passed with the CREATE Collection API command in the <<Quick Start: Automatically Adding Replicas,Quick Start>> section above automatically creates a trigger that watches for a node going away. When the trigger fires, it executes a set of actions that compute and execute a plan to move all replicas hosted by the lost node to new nodes in the cluster. The target nodes are chosen based on the policy and preferences.

You can learn more about Triggers in the section <<solrcloud-autoscaling-triggers.adoc#solrcloud-autoscaling-triggers,Autoscaling Triggers>>.
@ -95,9 +98,9 @@ You can learn more about Trigger Actions in the section <<solrcloud-autoscaling-
== Listeners

An Autoscaling *listener* can be attached to a trigger. Solr calls the listener each time the trigger fires as well as before and after the actions performed by the trigger. Listeners are useful as a callback mechanism to perform tasks such as logging or informing external systems about events. For example, a listener is automatically added by Solr to each trigger to log details of the trigger fire and actions to the `.system` collection.

You can learn more about Listeners in the section <<solrcloud-autoscaling-listeners.adoc#solrcloud-autoscaling-listeners,Autoscaling Listeners>>.
== Autoscaling APIs
@ -26,17 +26,17 @@ The autoscaling policy and preferences are a set of rules and sorting preference
A preference is a hint to Solr on how to sort nodes based on their utilization. The default cluster preference is to sort by the total number of Solr cores (or replicas) hosted by a node. Therefore, by default, when selecting a node to add a replica, Solr can apply the preferences and choose the node with the least number of cores.

More than one preference can be added to break ties. For example, we may choose to use free disk space to break ties if the number of cores on two nodes is the same. The node with the higher free disk space can be chosen as the target of the cluster operation.

Each preference takes the following form:

[source,json]
{"<sort_order>":"<sort_param>", "precision":"<precision_val>"}
`sort_order`::
The value can be either `maximize` or `minimize`. Choose `minimize` to sort the nodes with least value as the least loaded. For example, `{"minimize":"cores"}` sorts the nodes with the least number of cores as the least loaded node. A sort order such as `{"maximize":"freedisk"}` sorts the nodes with maximum free disk space as the least loaded node.
+
The objective of the system is to make every node the least loaded. So, in case of a `MOVEREPLICA` operation, it usually targets the _most loaded_ node and takes load off of it. In a sort of more loaded to less loaded, `minimize` is akin to sorting in descending order and `maximize` is akin to sorting in ascending order.
+
This is a required parameter.
@ -53,7 +53,7 @@ Precision tells the system the minimum (absolute) difference between 2 values to
+
For example, a precision of 10 for `freedisk` means that two nodes whose free disk space is within 10GB of each other should be treated as equal for the purpose of sorting. This helps create ties without which specifying multiple preferences is not useful. This is an optional parameter whose value must be a positive integer. The maximum value of `precision` must be less than the maximum value of the `sort_value`, if any.

See the section <<solrcloud-autoscaling-api.adoc#create-and-modify-cluster-preferences,Create and Modify Cluster Preferences>> for details on how to manage cluster preferences with the API.

=== Examples of Cluster Preferences
@ -23,14 +23,14 @@ health and good use of resources.
Currently two implementations are provided: `ComputePlanAction` and `ExecutePlanAction`.

== Compute Plan Action

The `ComputePlanAction` uses the policy and preferences to calculate the optimal set of Collection API
commands which can re-balance the cluster in response to trigger events.

Currently, it has no configurable parameters.
== Execute Plan Action

The `ExecutePlanAction` executes the Collection API commands emitted by the `ComputePlanAction` against
the cluster using SolrJ. It executes the commands serially, waiting for each of them to succeed before
@ -38,9 +38,9 @@ continuing with the next one.
Currently, it has no configurable parameters.

If any one of the commands fails, then the complete chain of actions is
executed again at the next run of the trigger. If the Overseer node fails while `ExecutePlanAction` is running,
then the new Overseer node will run the chain of actions for the same event again after waiting for any
running Collection API operations belonging to the event to complete.

Please see <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,SolrCloud Autoscaling Fault Tolerance>> for more details on fault tolerance within the autoscaling framework.
@ -18,19 +18,20 @@
// specific language governing permissions and limitations
// under the License.

Triggers are used in autoscaling to watch for cluster events such as nodes joining or leaving.

In the future, other cluster, node, and replica events that are important from the
point of view of cluster performance will also have triggers available.
Trigger implementations verify the state of resources that they monitor. When they detect a
change that merits attention they generate events, which are then queued and processed by configured
`TriggerAction` implementations. This usually involves computing and executing a plan to manage the new cluster
resources (e.g., move replicas). Solr provides predefined implementations of triggers for specific event types.

Triggers execute on the node that runs `Overseer`. They are scheduled to run periodically,
currently at a fixed interval of 1 second between each execution (not every execution produces events).
== Event Types

Currently the following event types (and corresponding trigger implementations) are defined:

* `nodeAdded` - generated when a new node joins the cluster
@ -41,49 +42,45 @@ maximum rate of events is controlled by the `waitFor` configuration parameter (s
The following properties are common to all event types (an illustrative example follows the list):

* `id` - (string) A unique time-based event id.
* `eventType` - (string) The type of event.
* `source` - (string) The name of the trigger that produced this event.
* `eventTime` - (long) Unix time when the condition that caused this event occurred. For example, for a
`nodeAdded` event this will be the time when the node was added and not when the event was actually
generated, which may significantly differ due to the rate limits set by `waitFor`.
* `properties` - (map, optional) Any additional properties. Currently this includes a `nodeName` property that
indicates the node that was lost or added.
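For illustration, a `nodeLost` event carrying these properties might look roughly like the following sketch; the id, timestamp, and node name shown here are hypothetical:

[source,json]
----
{
  "id": "1a2b3c4d5eTexample",
  "source": ".auto_add_replicas",
  "eventType": "nodeLost",
  "eventTime": 1508237625448,
  "properties": {
    "nodeName": "192.168.1.1:8983_solr"
  }
}
----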
== Auto Add Replicas Trigger

When a collection has the parameter `autoAddReplicas` set to true, a trigger configuration named `.auto_add_replicas` is automatically created to watch for nodes going away. This trigger produces `nodeLost` events,
which are then processed by configured actions (usually resulting in computing and executing a plan
to add replicas on the live nodes to maintain the expected replication factor).

See the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling Automatically Adding Replicas>> to learn more about how the `.autoAddReplicas` trigger works.
== Trigger Configuration

Trigger configurations are managed using the Autoscaling Write API and the commands `set-trigger`, `remove-trigger`,
`suspend-trigger`, and `resume-trigger`.

Trigger configuration consists of the following properties:

* `name` - (string, required) A unique trigger configuration name.
* `event` - (string, required) One of the predefined event types (`nodeAdded` or `nodeLost`).
* `actions` - (list of action configs, optional) An ordered list of actions to execute when the event is fired.
* `waitFor` - (string, optional) The time to wait between generating new events, as an integer number immediately followed by a unit symbol, one of `s` (seconds), `m` (minutes), or `h` (hours). Default is `0s`.
* `enabled` - (boolean, optional) When `true` the trigger is enabled. Default is `true`.
* Additional implementation-specific properties may be provided.
Action configuration consists of the following properties:

* `name` - (string, required) A unique name of the action configuration.
* `class` - (string, required) The action implementation class.
* Additional implementation-specific properties may be provided.

If the action configuration is omitted, then by default, the `ComputePlanAction` and the `ExecutePlanAction` are automatically added to the trigger configuration.

.Example: adding or updating a trigger for `nodeAdded` events
[source,json]
----
{
@ -110,3 +107,4 @@ is also used to possibly modify the plan.
}
----

This trigger configuration will compute and execute a plan to allocate the resources available on the new node. A custom action is also used to possibly modify the plan.
@ -1,4 +1,4 @@
= SolrCloud Autoscaling
:page-shortname: solrcloud-autoscaling
:page-permalink: solrcloud-autoscaling.html
:page-children: solrcloud-autoscaling-overview, solrcloud-autoscaling-api, solrcloud-autoscaling-policy-preferences, solrcloud-autoscaling-triggers, solrcloud-autoscaling-trigger-actions, solrcloud-autoscaling-listeners, solrcloud-autoscaling-auto-add-replicas, solrcloud-autoscaling-fault-tolerance
@ -27,10 +27,10 @@ Autoscaling includes an API to manage cluster-wide and collection-specific polic
The following sections describe the autoscaling features of SolrCloud:

* <<solrcloud-autoscaling-overview.adoc#solrcloud-autoscaling-overview,Overview of Autoscaling in SolrCloud>>
* <<solrcloud-autoscaling-api.adoc#solrcloud-autoscaling-api,Autoscaling API>>
* <<solrcloud-autoscaling-policy-preferences.adoc#solrcloud-autoscaling-policy-preferences,Autoscaling Policy and Preferences>>
* <<solrcloud-autoscaling-triggers.adoc#solrcloud-autoscaling-triggers,Autoscaling Triggers>>
* <<solrcloud-autoscaling-trigger-actions.adoc#solrcloud-autoscaling-trigger-actions,Autoscaling Trigger Actions>>
* <<solrcloud-autoscaling-listeners.adoc#solrcloud-autoscaling-listeners,Autoscaling Listeners>>
* <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas,Autoscaling - Automatically Adding Replicas>>
* <<solrcloud-autoscaling-fault-tolerance.adoc#solrcloud-autoscaling-fault-tolerance,Autoscaling Fault Tolerance>>
@ -73,7 +73,7 @@ The center point using the format "lat,lon" if latitude & longitude. Otherwise,
A spatial indexed field.

`score`::
(Advanced option; not supported by LatLonType (deprecated) or PointType) If the query is used in a scoring context (e.g., as the main query in `q`), this _<<local-parameters-in-queries.adoc#local-parameters-in-queries,local parameter>>_ determines what scores will be produced. Valid values are:

* `none`: A fixed score of 1.0. (the default)
* `kilometers`: distance in kilometers between the field value and the specified center point
@ -84,7 +84,7 @@ A spatial indexed field.
+
[WARNING]
====
Don't use this for indexed non-point shapes (e.g., polygons). The results will be erroneous. And with RPT, it's only recommended for multi-valued point data, as the implementation doesn't scale very well; for single-valued fields, you should instead use a separate non-RPT field purely for distance sorting.
====
+
When used with `BBoxField`, additional options are supported:
@ -129,7 +129,7 @@ Here's an example:
`&q=*:*&fq=store:[45,-94 TO 46,-93]`

LatLonType (deprecated) does *not* support rectangles that cross the dateline. For RPT and BBoxField, if you are using non-geospatial coordinates (`geo="false"`) then you must quote the points due to the space, e.g., `"x y"`.

=== Optimizing: Cache or Not
@ -193,7 +193,7 @@ RPT offers several functional improvements over LatLonPointSpatialField:
* Non-geodetic geo=false general x & y (_not_ latitude and longitude) -- if desired
* Query by polygons and other complex shapes, in addition to circles & rectangles
* Ability to index non-point shapes (e.g., polygons) as well as points -- see RptWithGeometrySpatialField
* Heatmap grid faceting

RPT _shares_ various features in common with `LatLonPointSpatialField`. Some are listed here:
@ -411,7 +411,7 @@ BBoxField is actually based off of 4 instances of another field type referred to
To index a box, add a field value to a bbox field that's a string in the WKT/CQL ENVELOPE syntax. Example: `ENVELOPE(-10, 20, 15, 10)` which is minX, maxX, maxY, minY order. The parameter ordering is unintuitive but that's what the spec calls for. Alternatively, you could provide a rectangular polygon in WKT (or GeoJSON if you set `format="GeoJSON"`).

To search, you can use the `{!bbox}` query parser, or the range syntax e.g., `[10,-10 TO 15,20]`, or the ENVELOPE syntax wrapped in parentheses with a leading search predicate. The latter is the only way to choose a predicate other than Intersects. For example:

[source,plain]
&q={!field f=bbox}Contains(ENVELOPE(-10, 20, 15, 10))
@ -22,7 +22,7 @@
== cartesianProduct

The `cartesianProduct` function turns a single tuple with a multi-valued field (i.e., an array) into multiple tuples, one for each value in the array field. That is, given a single tuple containing an array of N values for `fieldA`, the `cartesianProduct` function will output N tuples, each with one value from the original tuple's array. In essence, you can flatten arrays for further processing.

For example, using `cartesianProduct` you can turn this tuple

[source,text]
@ -41,7 +41,7 @@ In addition to all the <<the-dismax-query-parser.adoc#dismax-query-parser-parame
Split on whitespace. If set to `true`, text analysis is invoked separately for each individual whitespace-separated term. The default is `false`; whitespace-separated term sequences will be provided to text analysis in one shot, enabling proper function of analysis filters that operate over term sequences, e.g., multi-word synonyms and shingles.

`mm.autoRelax`::
If `true`, the number of clauses required (<<the-dismax-query-parser.adoc#mm-minimum-should-match-parameter,minimum should match>>) will automatically be relaxed if a clause is removed (e.g., by the stopwords filter) from some but not all <<the-dismax-query-parser.adoc#qf-query-fields-parameter,`qf`>> fields. Use this parameter as a workaround if you find that queries return zero hits due to uneven stopword removal between the `qf` fields.
+
Note that relaxing `mm` may cause undesired side effects, such as hurting the precision of the search, depending on the nature of your index content.
@ -72,8 +72,8 @@ If you are worried about the SolrJ libraries expanding the size of your client a
Most `SolrClient` implementations (with the notable exception of `CloudSolrClient`) require users to specify one or more Solr base URLs, which the client then uses to send HTTP requests to Solr. The path users include on the base URL they provide has an effect on the behavior of the created client from that point on.

. A URL with a path pointing to a specific core or collection (e.g., `http://hostname:8983/solr/core1`). When a core or collection is specified in the base URL, subsequent requests made with that client are not required to re-specify the affected collection. However, the client is limited to sending requests to that core/collection, and cannot send requests to any others.
. A URL with a generic path pointing to the root Solr path (e.g., `http://hostname:8983/solr`). When no core or collection is specified in the base URL, requests can be made to any core/collection, but the affected core/collection must be specified on all requests.

== Setting XMLResponseParser
@ -70,7 +70,7 @@ You must specify parameters `amountLongSuffix` and `codeStrSuffix`, correspondin
defaultCurrency="USD" currencyConfig="currency.xml" /> defaultCurrency="USD" currencyConfig="currency.xml" />
---- ----
In the above example, the raw amount field will use the `"*_l_ns"` dynamic field, which must exist in the schema and use a long field type, i.e. one that extends `LongValueFieldType`. The currency code field will use the `"*_s_ns"` dynamic field, which must exist in the schema and use a string field type, i.e. one that is or extends `StrField`. In the above example, the raw amount field will use the `"*_l_ns"` dynamic field, which must exist in the schema and use a long field type, i.e., one that extends `LongValueFieldType`. The currency code field will use the `"*_s_ns"` dynamic field, which must exist in the schema and use a string field type, i.e., one that is or extends `StrField`.
.Atomic Updates won't work if dynamic sub-fields are stored .Atomic Updates won't work if dynamic sub-fields are stored
[NOTE] [NOTE]
@ -62,7 +62,7 @@ These are valid queries: +
Solr's `DateRangeField` supports the same point in time date syntax described above (with _date math_ described below) and more to express date ranges. One class of examples is truncated dates, which represent the entire date span to the precision indicated. The other class uses the range syntax (`[ TO ]`). Here are some examples:

* `2000-11` The entire month of November, 2000.
* `2000-11T13` Likewise but for an hour of the day (1300 to before 1400, i.e., 1pm to 2pm).
* `-0009` The year 10 BC. A 0 in the year position is 0 AD, and is also considered 1 BC.
* `[2000-11-01 TO 2014-12-01]` The specified date range at a day resolution.
* `[2014 TO 2014-12-01]` From the start of 2014 till the end of the first day of December.
@ -179,7 +179,7 @@ Special characters in "text" values can be escaped using the escape character `\
|`\t` |horizontal tab
|===

Please note that Unicode sequences (e.g., `\u0001`) are not supported.

==== Supported Attributes
@ -51,11 +51,11 @@ We want to be able to:
. Control which ACLs Solr will add to znodes (ZooKeeper files/folders) it creates in ZooKeeper.
. Control it "from the outside", so that you do not have to modify and/or recompile Solr code to turn this on.

Solr nodes, clients and tools (e.g., ZkCLI) always use a Java class called {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/SolrZkClient.html[`SolrZkClient`] to deal with ZooKeeper. The implementation of the solution described here is all about changing `SolrZkClient`. If you use `SolrZkClient` in your application, the descriptions below will be true for your application too.

=== Controlling Credentials

You control which credentials provider will be used by configuring the `zkCredentialsProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkCredentialsProvider[`ZkCredentialsProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkCredentialsProvider` such that it will take on the value of the same-named `zkCredentialsProvider` system property if it is defined (e.g., by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkCredentialsProvider` implementation.

==== Out of the Box Credential Implementations
@ -68,7 +68,7 @@ You can always make you own implementation, but Solr comes with two implementati
=== Controlling ACLs

You control which ACLs will be added by configuring the `zkACLProvider` property in `solr.xml` 's `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkACLProvider[`ZkACLProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkACLProvider` such that it will take on the value of the same-named `zkACLProvider` system property if it is defined (e.g., by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkACLProvider` implementation.

==== Out of the Box ACL Implementations
@ -90,7 +90,7 @@ If none of the above ACLs is added to the list, the (empty) ACL list of `Default
Notice the overlap in system property names with the credentials provider `VMParamsSingleSetCredentialsDigestZkCredentialsProvider` (described above). This is to let the two providers collaborate in a nice and perhaps common way: we always protect access to content by limiting to two users - an admin-user and a readonly-user - AND we always connect with credentials corresponding to this same admin-user, basically so that we can do anything to the content/znodes we create ourselves.

You can give the readonly credentials to "clients" of your SolrCloud cluster - e.g., to be used by SolrJ clients. They will be able to read whatever is necessary to run a functioning SolrJ client, but they will not be able to modify any content in ZooKeeper.

=== ZooKeeper ACLs in Solr Scripts