Ref Guide: fix typos & standardize spellings

Cassandra Targett 2019-12-17 12:59:17 -06:00
parent 3d4246089f
commit 3e8872738a
31 changed files with 72 additions and 70 deletions

@@ -76,7 +76,7 @@ When adding data, you should usually direct documents to the alias (e.g., refere
The Solr server and `CloudSolrClient` will direct an update request to the first collection that an alias points to.
Once the server receives the data it will perform the necessary routing.
-WARNING: It's extremely important with all routed aliases that the route values NOT change. Re-indexing a document
+WARNING: It's extremely important with all routed aliases that the route values NOT change. Reindexing a document
with a different route value for the same ID produces two distinct documents with the same ID accessible via the alias.
All query time behavior of the routed alias is *_undefined_* and not easily predictable once duplicate IDs exist.
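Indexing through the alias, as described above, leaves the routing to Solr. A minimal sketch (alias, field, and value names are illustrative, not taken from the diff):

[source,bash]
----
# Send the document to the alias; Solr routes it to the right underlying
# collection based on the configured route field (names here are hypothetical).
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/myRoutedAlias/update?commit=true' \
  --data-binary '[{"id": "doc1", "timestamp_dt": "2019-12-01T00:00:00Z"}]'
----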

@@ -184,6 +184,6 @@ Each Authentication plugin may also decide to secure inter-node requests on its
The `PKIAuthenticationPlugin` provides a built-in authentication mechanism where each Solr node is a super user and is fully trusted by other Solr nodes through the use of Public Key Infrastructure (PKI). Each Authentication plugin may choose to delegate all or some inter-node traffic to the PKI plugin.
-For each outgoing request `PKIAuthenticationPlugin` adds a special header `'SolrAuth'` which carries the timestamp and principal encrypted using the private key of that node. The public key is exposed through an API so that any node can read it whenever it needs it. Any node that gets the request with that header would get the public key from the sender and decrypt the information. If it is able to decrypt the data, the request is trusted. It is invalid if the timestamp is more than 5 secs old. This assumes that the clocks of different nodes in the cluster are synchronized. Only traffic from other Solr nodes registered with Zookeeper is trusted.
+For each outgoing request `PKIAuthenticationPlugin` adds a special header `'SolrAuth'` which carries the timestamp and principal encrypted using the private key of that node. The public key is exposed through an API so that any node can read it whenever it needs it. Any node that gets the request with that header would get the public key from the sender and decrypt the information. If it is able to decrypt the data, the request is trusted. It is invalid if the timestamp is more than 5 secs old. This assumes that the clocks of different nodes in the cluster are synchronized. Only traffic from other Solr nodes registered with ZooKeeper is trusted.
The timeout is configurable through a system property called `pkiauth.ttl`. For example, if you wish to bump up the time-to-live to 10 seconds (10000 milliseconds), start each node with a property `'-Dpkiauth.ttl=10000'`.
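As a sketch, assuming the standard `bin/solr` script in SolrCloud mode, a node could be started with the longer time-to-live like this:

[source,bash]
----
# Raise the PKI auth header time-to-live on this node to 10 seconds
bin/solr start -c -Dpkiauth.ttl=10000
----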

@@ -168,7 +168,7 @@ http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=urlScheme&v
It is possible to set cluster-wide default values for certain attributes of a collection, using the `defaults` parameter.
*Set/update default values*
-[source]
+[source,bash]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{
@@ -186,7 +186,7 @@ curl -X POST -H 'Content-type:application/json' --data-binary '
----
*Unset the only value of `nrtReplicas`*
-[source]
+[source,bash]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{
@@ -201,7 +201,7 @@ curl -X POST -H 'Content-type:application/json' --data-binary '
----
*Unset all values in `defaults`*
-[source]
+[source,bash]
----
curl -X POST -H 'Content-type:application/json' --data-binary '
{ "set-obj-property" : {
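For orientation, a complete call of this shape might look like the following sketch; the `/api/cluster` endpoint and the property values are illustrative assumptions, not taken from the diff:

[source,bash]
----
# Set a cluster-wide default of 2 shards for newly created collections
curl -X POST -H 'Content-type:application/json' --data-binary '
{ "set-obj-property" : { "defaults" : { "collection" : { "numShards" : 2 } } } }' \
  http://localhost:8983/api/cluster
----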

@@ -660,10 +660,10 @@ http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=test1&spl
`/admin/collections?action=REINDEXCOLLECTION&name=_name_`
-The REINDEXCOLLECTION command re-indexes a collection using existing data from the
+The REINDEXCOLLECTION command reindexes a collection using existing data from the
source collection.
-NOTE: Re-indexing is potentially a lossy operation - some of the existing indexed data that is not
+NOTE: Reindexing is potentially a lossy operation - some of the existing indexed data that is not
available as stored fields may be lost, so users should use this command
with caution, evaluating the potential impact by using different source and target
collection names first, and preserving the source collection until the evaluation is
@@ -671,19 +671,19 @@ complete.
The target collection must not exist (and may not be an alias). If the target
collection name is the same as the source collection then first a unique sequential name
-will be generated for the target collection, and then after re-indexing is done an alias
+will be generated for the target collection, and then after reindexing is done an alias
will be created that points from the source name to the actual sequentially-named target collection.
-When re-indexing is started the source collection is put in <<readonlymode,read-only mode>> to ensure that
+When reindexing is started the source collection is put in <<readonlymode,read-only mode>> to ensure that
all source documents are properly processed.
Using optional parameters a different index schema, collection shape (number of shards and replicas)
or routing parameters can be requested for the target collection.
-Re-indexing is executed as a streaming expression daemon, which runs on one of the
+Reindexing is executed as a streaming expression daemon, which runs on one of the
source collection's replicas. It is usually a time-consuming operation so it's recommended to execute
-it as an asynchronous request in order to avoid request time outs. Only one re-indexing operation may
-execute concurrently for a given source collection. Long-running, erroneous or crashed re-indexing
+it as an asynchronous request in order to avoid request time outs. Only one reindexing operation may
+execute concurrently for a given source collection. Long-running, erroneous or crashed reindexing
operations may be terminated by using the `abort` option, which also removes partial results.
=== REINDEXCOLLECTION Parameters
@@ -694,9 +694,9 @@ Source collection name, may be an alias. This parameter is required.
`cmd`::
Optional command. Default command is `start`. Currently supported commands are:
* `start` - default, starts processing if not already running,
-* `abort` - aborts an already running re-indexing (or clears a left-over status after a crash),
+* `abort` - aborts an already running reindexing (or clears a left-over status after a crash),
and deletes partial results,
-* `status` - returns detailed status of a running re-indexing command.
+* `status` - returns detailed status of a running reindexing command.
`target`::
Target collection name, optional. If not specified a unique name will be generated and
@@ -705,10 +705,10 @@ collection name to the unique sequentially-named collection, effectively "hiding
the original source collection from regular update and search operations.
`q`::
-Optional query to select documents for re-indexing. Default value is `\*:*`.
+Optional query to select documents for reindexing. Default value is `\*:*`.
`fl`::
-Optional list of fields to re-index. Default value is `*`.
+Optional list of fields to reindex. Default value is `*`.
`rows`::
Documents are transferred in batches. Depending on the average size of the document large
@@ -732,7 +732,7 @@ be deleted.
`async`::
Optional request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
-When the re-indexing process has completed the target collection is marked using
+When the reindexing process has completed the target collection is marked using
`property.rx: "finished"`, and the source collection state is updated to become read-write.
On any errors the command will delete any temporary and target collections and also reset the
state of the source collection's read-only flag.
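A typical asynchronous invocation might look like the following sketch (the collection name, target shape, and request ID are hypothetical):

[source,bash]
----
# Reindex "mycollection" into 3 target shards, tracking progress via "reindex1"
curl 'http://localhost:8983/solr/admin/collections?action=REINDEXCOLLECTION&name=mycollection&numShards=3&async=reindex1'
----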
@@ -770,9 +770,9 @@ for the purpose of indexing and searching. The source collection is assumed to b
}
}
----
-As a result a new collection `.rx_newCollection_2` has been created, with selected documents re-indexed to 3 shards, and
+As a result a new collection `.rx_newCollection_2` has been created, with selected documents reindexed to 3 shards, and
with an alias pointing from `newCollection` to this one. The status also shows that the source collection
-was already an alias to `.rx_newCollection_1`, which was likely a result of a previous re-indexing.
+was already an alias to `.rx_newCollection_1`, which was likely a result of a previous reindexing.
[[colstatus]]
== COLSTATUS: Detailed Status of a Collection's Indexes
@@ -860,7 +860,7 @@ This section also shows `topN` values by size from each field.
Data types shown in the response can be roughly divided into the following groups:
-* `storedFields` - represents the raw uncompressed data in stored fields. Eg. for UTF-8 strings this represents
+* `storedFields` - represents the raw uncompressed data in stored fields. For example, for UTF-8 strings this represents
the aggregated sum of the number of bytes in the strings' UTF-8 representation, for long numbers this is 8 bytes per value, etc.
* `terms_terms` - represents the aggregated size of the term dictionary. The size of this data is affected by the
@@ -876,7 +876,7 @@ has an `omitNorms` flag in the schema, which is common for fields that don't nee
* `termVectors` - represents the aggregated size of term vectors.
-* `docValues_*` - represents aggregated size of doc values, by type (eg. `docValues_numeric`, `docValues_binary`, etc).
+* `docValues_*` - represents aggregated size of doc values, by type (e.g., `docValues_numeric`, `docValues_binary`, etc).
* `points` - represents aggregated size of point values.

@@ -128,7 +128,7 @@ Using the bootstrap command with a ZooKeeper chroot in the `-zkhost` parameter, e
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd putfile /my_zk_file.txt /tmp/my_local_file.txt
----
-=== Link a Collection to a ConfigSet
+=== Link a Collection to a Configset
[source,bash]
----

@@ -277,7 +277,7 @@ fq=quantity_in_stock:[5 TO *]
fq={!frange l=10 u=100}mul(popularity,price)
fq={!frange cost=200 l=0}pow(mul(sum(1, query('tag:smartphone')), div(1,avg_rating)), 2.3)
-These are the same filters run w/o caching. The simple range query on the `quantity_in_stock` field will be run in parallel with the main query like a traditional lucene filter, while the 2 `frange` filters will only be checked against each document that has already matched the main query and the `quantity_in_stock` range query -- first the simpler `mul(popularity,price)` will be checked (because of its implicit `cost=100`) and only if it matches will the final very complex filter (with its higher `cost=200`) be checked.
+These are the same filters run w/o caching. The simple range query on the `quantity_in_stock` field will be run in parallel with the main query like a traditional Lucene filter, while the 2 `frange` filters will only be checked against each document that has already matched the main query and the `quantity_in_stock` range query -- first the simpler `mul(popularity,price)` will be checked (because of its implicit `cost=100`) and only if it matches will the final very complex filter (with its higher `cost=200`) be checked.
[source,text]
q=some keywords

@@ -1,4 +1,4 @@
-= Config Sets
+= Configsets
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information

@@ -27,7 +27,7 @@ Once a configset has been uploaded to ZooKeeper, use the configset name when cre
Configsets do not have to be shared between collections if they are uploaded with this API, but this API makes it easier to do so if you wish. An alternative to uploading your configsets in advance would be to put the configuration files into a directory under `server/solr/configsets` and using the directory name as the `-d` parameter when using `bin/solr create` to create a collection.
-NOTE: This API can only be used with Solr running in SolrCloud mode. If you are not running Solr in SolrCloud mode but would still like to use shared configurations, please see the section <<config-sets.adoc#config-sets,Config Sets>>.
+NOTE: This API can only be used with Solr running in SolrCloud mode. If you are not running Solr in SolrCloud mode but would still like to use shared configurations, please see the section <<config-sets.adoc#config-sets,Configsets>>.
The API works by passing commands to the `configs` endpoint. The path to the endpoint varies depending on the API being used: the v1 API uses `solr/admin/configs`, while the v2 API uses `api/cluster/configs`. Examples of both types are provided below.
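As a sketch of an upload through the v1 endpoint named above (the configset name and zip file are illustrative):

[source,bash]
----
# Upload a zipped configset to ZooKeeper under the name "myConfigSet"
curl -X POST --header "Content-Type:application/octet-stream" \
  --data-binary @myconfigset.zip \
  "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet"
----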

@@ -60,7 +60,7 @@ Your CREATE call must be able to find a configuration, or it will not succeed.
When you are running SolrCloud and create a new core for a collection, the configuration will be inherited from the collection. Each collection is linked to a configName, which is stored in ZooKeeper. This satisfies the configuration requirement. There is something to note, though: if you're running SolrCloud, you should *NOT* use the CoreAdmin API at all. Use the <<collections-api.adoc#collections-api,Collections API>>.
-When you are not running SolrCloud, if you have <<config-sets.adoc#config-sets,Config Sets>> defined, you can use the `configSet` parameter as documented below. If there are no configsets, then the `instanceDir` specified in the CREATE call must already exist, and it must contain a `conf` directory which in turn must contain `solrconfig.xml`, your schema (usually named either `managed-schema` or `schema.xml`), and any files referenced by those configs.
+When you are not running SolrCloud, if you have <<config-sets.adoc#config-sets,Configsets>> defined, you can use the `configSet` parameter as documented below. If there are no configsets, then the `instanceDir` specified in the CREATE call must already exist, and it must contain a `conf` directory which in turn must contain `solrconfig.xml`, your schema (usually named either `managed-schema` or `schema.xml`), and any files referenced by those configs.
The config and schema filenames can be specified with the `config` and `schema` parameters, but these are expert options. One thing you could do to avoid creating the `conf` directory is use `config` and `schema` parameters that point at absolute paths, but this can lead to confusing configurations unless you fully understand what you are doing.
====
@@ -89,7 +89,7 @@ Name of the schema file to use for the core. Please note that if you are using a
Name of the data directory relative to `instanceDir`.
`configSet`::
-Name of the configset to use for this core. For more information, see the section <<config-sets.adoc#config-sets,Config Sets>>.
+Name of the configset to use for this core. For more information, see the section <<config-sets.adoc#config-sets,Configsets>>.
`collection`::
The name of the collection to which this core belongs. The default is the name of the core. `collection._param_=_value_` causes a property of `_param_=_value_` to be set if a new collection is being created. Use `collection.configName=_config-name_` to point to the configuration for a new collection.
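A minimal CREATE call combining the parameters above might look like this sketch (core and configset names are illustrative):

[source,bash]
----
# Create a standalone core backed by a named configset
curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&configSet=_default'
----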

@@ -74,7 +74,7 @@ The following properties are available:
`dataDir`:: The core's data directory (where indexes are stored) as either an absolute pathname, or a path relative to the value of `instanceDir`. This is `data` by default.
-`configSet`:: The name of a defined configset, if desired, to use to configure the core (see the section <<config-sets.adoc#config-sets,Config Sets>> for more details).
+`configSet`:: The name of a defined configset, if desired, to use to configure the core (see the section <<config-sets.adoc#config-sets,Configsets>> for more details).
`properties`:: The name of the properties file for this core. The value can be an absolute pathname or a path relative to the value of `instanceDir`.

@@ -185,9 +185,9 @@ Applied after sorting by inherent replica attributes, this property defines a fa
+
`random`, the default, randomly shuffles replicas for each request. This distributes requests evenly, but can result in sub-optimal cache usage for shards with replication factor > 1.
+
-`stable:dividend:_paramName_` parses an integer from the value associated with the given param name; this integer is used as the dividend (mod equivalent replica count) to determine (via list rotation) order of preference among equivalent replicas.
+`stable:dividend:_paramName_` parses an integer from the value associated with the given parameter name; this integer is used as the dividend (mod equivalent replica count) to determine (via list rotation) order of preference among equivalent replicas.
+
-`stable[:hash[:_paramName_]]` the string value associated with the given param name is hashed to a dividend that is used to determine replica preference order (analogous to the explicit `dividend` property above); `_paramName_` defaults to `q` if not specified, providing stable routing keyed to the string value of the "main query". Note that this may be inappropriate for some use cases (e.g., static main queries that leverage parameter substitution)
+`stable[:hash[:_paramName_]]` the string value associated with the given parameter name is hashed to a dividend that is used to determine replica preference order (analogous to the explicit `dividend` property above); `_paramName_` defaults to `q` if not specified, providing stable routing keyed to the string value of the "main query". Note that this may be inappropriate for some use cases (e.g., static main queries that leverage parameter substitution)
Examples:

@@ -334,7 +334,7 @@ NOTE: If your operating system does not include curl, you can download binaries
=== Create a SolrCloud Collection using bin/solr
-Create a 2-shard, replicationFactor=1 collection named mycollection using the default configset (_default):
+Create a 2-shard, replicationFactor=1 collection named mycollection using the `_default` configset:
.*nix command
[source,bash]

@@ -100,7 +100,7 @@ Sets the maximum number of clauses allowed in any boolean query.
+
This global limit provides a safety constraint on the number of clauses allowed in any boolean queries against any collection -- regardless of whether those clauses were explicitly specified in a query string, or were the result of query expansion/re-writing from a more complex type of query based on the terms in the index.
+
-In default configurations this property uses the value of the `solr.max.booleanClauses` system property if specified. This is the same system property used in the default configset for the <<query-settings-in-solrconfig#maxbooleanclauses,`<maxBooleanClauses>` setting of `solrconfig.xml`>> making it easy for Solr administrators to increase both values (in all collections) without needing to search through and update all of their configs.
+In default configurations this property uses the value of the `solr.max.booleanClauses` system property if specified. This is the same system property used in the `_default` configset for the <<query-settings-in-solrconfig#maxbooleanclauses,`<maxBooleanClauses>` setting of `solrconfig.xml`>> making it easy for Solr administrators to increase both values (in all collections) without needing to search through and update all of their configs.
+
[source,xml]
----
@@ -216,7 +216,9 @@ A NamedList specifying replica routing preference configuration. This may be use
</lst>
</shardHandlerFactory>
----
-Replica routing may also be specified (overriding defaults) per-request, via the `shards.preference` request parameter. If a request contains both dividend param and hash param, dividend param takes priority for routing. For configuring `stable` routing, the `hash` param implicitly defaults to a hash of the String value of the main query param (i.e., `q`). `dividend` param must be configured explicitly; there is no implicit default. If only `dividend` routing is desired, `hash` may be explicitly set to the empty string, entirely disabling implicit hash-based routing.
+Replica routing may also be specified (overriding defaults) per-request, via the `shards.preference` request parameter. If a request contains both `dividend` and `hash`, `dividend` takes priority for routing. For configuring `stable` routing, the `hash` parameter implicitly defaults to a hash of the String value of the main query parameter (i.e., `q`).
++
+The `dividend` parameter must be configured explicitly; there is no implicit default. If only `dividend` routing is desired, `hash` may be explicitly set to the empty string, entirely disabling implicit hash-based routing.
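As an illustrative sketch only, since the enclosing `shards.preference` property name is not shown in this hunk (`replica.base` below is an assumption), a per-request stable routing call might look like:

[source,bash]
----
# Stable replica routing driven by an explicit dividend parameter (assumed syntax)
curl 'http://localhost:8983/solr/mycollection/select?q=test&shards.preference=replica.base:stable:dividend:routingDividend&routingDividend=2'
----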
=== The <metrics> Element

@@ -47,7 +47,7 @@ Logging:: Retrieve and modify registered loggers.
v2: `api/node/logging` |{solr-javadocs}/solr-core/org/apache/solr/handler/admin/ShowFileRequestHandler.html[LoggingHandler] |`_ADMIN_LOGGING`
|===
-Luke:: Expose the internal lucene index. This handler must have a collection name in the path to the endpoint.
+Luke:: Expose the internal Lucene index. This handler must have a collection name in the path to the endpoint.
+
*Documentation*: http://wiki.apache.org/solr/LukeRequestHandler
+

@@ -270,7 +270,7 @@ include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-ipod-query-bool]
====
--
-If lucene is the default query parser, the example above can be simplified to:
+If `lucene` is the default query parser, the example above can be simplified to:
[.dynamic-tabs]
--

@@ -163,7 +163,7 @@ Let's comment on this config:
== Editing JWT Authentication Plugin Configuration
-All properties mentioned above can be set or changed using the Config Edit API. You can thus start with a simple configuration with only `class` and `blockUnknown=false` configured and then configure the rest using the API.
+All properties mentioned above can be set or changed using the <<basic-authentication-plugin.adoc#editing-basic-authentication-plugin-configuration,Authentication API>>. You can thus start with a simple configuration with only `class` and `blockUnknown=false` configured and then configure the rest using the API.
=== Set a Configuration Property
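A sketch of such an edit, assuming the standard `/admin/authentication` endpoint and a property named earlier in this section:

[source,bash]
----
# Enable blockUnknown via the Authentication API
curl http://localhost:8983/solr/admin/authentication \
  -H 'Content-type:application/json' \
  -d '{"set-property": {"blockUnknown": true}}'
----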

@@ -67,7 +67,7 @@ See the section <<solrcloud-autoscaling.adoc#solrcloud-autoscaling,SolrCloud Aut
== Configuration and Default Changes
-=== New Default ConfigSet
+=== New Default Configset
Several changes have been made to configsets that ship with Solr; not only their content but how Solr behaves in regard to them:
* The `data_driven_configset` and `basic_configset` have been removed, and replaced by the `_default` configset. The `sample_techproducts_configset` also remains, and is designed for use with the example documents shipped with Solr in the `example/exampledocs` directory.
@@ -121,7 +121,7 @@ This may affect users who bring up replicas and they are automatically registere
* The `handleSelect` parameter in `solrconfig.xml` now defaults to `false` if the `luceneMatchVersion` is 7.0.0 or above. This causes Solr to ignore the `qt` parameter if it is present in a request. If you have request handlers without a leading '/', you can set `handleSelect="true"` or consider migrating your configuration.
+
The `qt` parameter is still used as a SolrJ special parameter that specifies the request handler (tail URL path) to use.
-* The lucenePlusSort query parser (aka the "Old Lucene Query Parser") has been deprecated and is no longer implicitly defined. If you wish to continue using this parser until Solr 8 (when it will be removed), you must register it in your `solrconfig.xml`, as in: `<queryParser name="lucenePlusSort" class="solr.OldLuceneQParserPlugin"/>`.
+* The `lucenePlusSort` query parser (aka the "Old Lucene Query Parser") has been deprecated and is no longer implicitly defined. If you wish to continue using this parser until Solr 8 (when it will be removed), you must register it in your `solrconfig.xml`, as in: `<queryParser name="lucenePlusSort" class="solr.OldLuceneQParserPlugin"/>`.
* The name of `TemplateUpdateRequestProcessorFactory` is changed to `template` from `Template` and the name of `AtomicUpdateProcessorFactory` is changed to `atomic` from `Atomic`
** Also, `TemplateUpdateRequestProcessorFactory` now uses `{}` instead of `${}` for `template`.

@@ -403,7 +403,7 @@ See the https://wiki.apache.org/solr/ReleaseNote73[7.3 Release Notes] for an ove
When upgrading to Solr 7.3, users should be aware of the following major changes from v7.2:
-*ConfigSets*
+*Configsets*
* Collections created without specifying a configset name have used a copy of the `_default` configset since Solr 7.0. Before 7.3, the copied configset was named the same as the collection name, but from 7.3 onwards it will be named with a new ".AUTOCREATED" suffix. This is to prevent overwriting custom configset names.

@@ -28,7 +28,7 @@ $ bin/solr -c -Denable.packages=true
----
=== Upload Your Keys
-Package binaries must be signed with your private keys, and you must ensure your public keys are published in Zookeeper
+Package binaries must be signed with your private keys, and you must ensure your public keys are published in ZooKeeper.
*Example*
[source,bash]
@@ -36,7 +36,7 @@ Package binaries must be signed with your private keys and ensure your public ke
$ openssl genrsa -out my_key.pem 512
# create the public key in .der format
$ openssl rsa -in my_key.pem -pubout -outform DER -out my_key.der
-# upload to Zookeeper
+# upload to ZooKeeper
$ server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd makepath /keys/exe/
$ server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd putfile /keys/exe/my_key.der my_key.der
----
@@ -118,7 +118,7 @@ For every package / version in the packages definition, there is a unique `SolrR
=== packages.json
-The package configurations live in a file called /packages.json in Zookeeper. At any given moment we can have multiple versions of a given package in the package configuration. The system will always use the latest version. Versions are sorted by their numeric value and the biggest is the latest.
+The package configurations live in a file called /packages.json in ZooKeeper. At any given moment we can have multiple versions of a given package in the package configuration. The system will always use the latest version. Versions are sorted by their numeric value and the biggest is the latest.
*example:*

@@ -48,7 +48,7 @@ Essentially, these are the phases in using the package manager:
=== Starting Solr with Package Management Support
-Start all Solr nodes with the `-Denable.packages=true` parameter. There are security consequences in doing so. At a minimum, no unauthorized user should have write access to Zookeeper instances, since it would then be possible to install packages from untrusted sources (e.g. malicious repositories).
+Start all Solr nodes with the `-Denable.packages=true` parameter. There are security consequences in doing so. At a minimum, no unauthorized user should have write access to ZooKeeper instances, since it would then be possible to install packages from untrusted sources (e.g. malicious repositories).
[source,bash]
----
@@ -149,6 +149,6 @@ $ bin/solr package deploy mypackage:2.0.0 --update -collections mycollection
You can run the `list-deployed` command to verify that this collection is using the newly added version.
== Security
-Except the `add-repo` step, all other steps can be executed using an HTTP endpoint in Solr (see <<package-manager-internals.adoc#package-manager-internals,Package Manager internals>>). This step registers the public key of the trusted repository, and hence can only be executed using the package manager (CLI) having direct write access to Zookeeper. Hence, as you can imagine, it is important to protect Zookeeper from unauthorized write access.
+Except the `add-repo` step, all other steps can be executed using an HTTP endpoint in Solr (see <<package-manager-internals.adoc#package-manager-internals,Package Manager internals>>). This step registers the public key of the trusted repository, and hence can only be executed using the package manager (CLI) having direct write access to ZooKeeper. Hence, as you can imagine, it is important to protect ZooKeeper from unauthorized write access.
Also, keep in mind that it is possible to install any package from a trusted and an already added repository. Hence, if you want to use some packages in production, then it is better to set up your own repository and add that to Solr, instead of adding a generic third-party repository that is beyond your administrative control.

@@ -102,11 +102,11 @@ they are very different in nature, thus making it difficult to measure the laten
local requests separately. Solr 8.4 introduced additional statistics that help to do this.
These metrics are structured the same as `requestTimes` and `totalTime` metrics above but they use
-different full names, eg. `QUERY./select.distrib.requestTimes` and `QUERY./select.local.requestTimes`.
+different full names, e.g., `QUERY./select.distrib.requestTimes` and `QUERY./select.local.requestTimes`.
The metrics under the `distrib` path correspond to the time it takes for a (potentially) distributed
request to complete all remote calls plus any local processing, and return the result to the caller.
The metrics under the `local` path correspond to the time it takes for a local call (non-distributed,
-i.e. being processed only by the Solr core where the handler operates) to complete.
+i.e., being processed only by the Solr core where the handler operates) to complete.
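These metric names can be pulled through the Metrics API; a sketch (the handler and filter values are illustrative):

[source,bash]
----
# Compare distributed vs. local request times for the /select handler
curl 'http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.distrib.requestTimes,QUERY./select.local.requestTimes'
----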
== Update Handler

@@ -27,9 +27,9 @@ After you update a resource, you'll typically need to _reload_ the affected coll
Restarting all affected Solr nodes also works.
<<managed-resources.adoc#managed-resources,Managed resources>> can be manipulated via APIs and do not need an explicit reload.
-== Resources in ConfigSets
+== Resources in Configsets
-<<config-sets.adoc#config-sets,ConfigSets>> are the directories containing solrconfig.xml, the schema, and resources referenced by them.
+<<config-sets.adoc#config-sets,Configsets>> are the directories containing solrconfig.xml, the schema, and resources referenced by them.
In SolrCloud they are in ZooKeeper whereas in standalone they are on the file system.
In either mode, configsets might be shared or might be dedicated to a single core or collection.
Prefer to put resources here.

@@ -58,7 +58,7 @@ Authorization makes sure that only users with the necessary roles/permissions ca
Audit logging will record an audit trail of incoming requests to your cluster, such as users being denied access to admin APIs. Learn more about audit logging and how to implement an audit logger plugin in the <<audit-logging.adoc#audit-logging,Audit Logging>> chapter.
-== Securing Zookeeper Traffic
+== Securing ZooKeeper Traffic
-Zookeeper is a central and important part of a SolrCloud cluster and understanding how to secure
+ZooKeeper is a central and important part of a SolrCloud cluster and understanding how to secure
its content is covered in the <<zookeeper-access-control.adoc#zookeeper-access-control,ZooKeeper Access Control>> page.

@@ -33,4 +33,4 @@ The following sections describe these options in more detail.
* *<<format-of-solr-xml.adoc#format-of-solr-xml,Format of solr.xml>>*: Details on how to define `solr.xml`, including the acceptable parameters for the `solr.xml` file
* *<<defining-core-properties.adoc#defining-core-properties,Defining core.properties>>*: Details on placement of `core.properties` and available property options.
* *<<coreadmin-api.adoc#coreadmin-api,CoreAdmin API>>*: Tools and commands for core administration using a REST API.
-* *<<config-sets.adoc#config-sets,Config Sets>>*: How to use configsets to avoid duplicating effort when defining a new core.
+* *<<config-sets.adoc#config-sets,Configsets>>*: How to use configsets to avoid duplicating effort when defining a new core.

@@ -517,7 +517,7 @@ params := [ path, '?' ] key, '=', value { '&', key, '=', value } *
Keys and values additionally use www-urlencoded format to avoid meta-characters and non-ASCII characters.
-The `params` part of the line closely follows a regular Solr params representation on purpose - in many cases
+The `params` part of the line closely follows a regular Solr parameter representation on purpose - in many cases
the content of this part of the command is passed directly to the respective collection- or cluster-level API.
==== Scenario Context
@@ -544,7 +544,7 @@ load_snapshot path=/tmp/${foo}
----
==== Scenario Commands
-The following commands are supported (command names are case insensitive, but params aren't):
+The following commands are supported (command names are case insensitive, but parameter names are not):
* `create_cluster numNodes=N[&disableMetricsHistory=false&timeSource=simTime:50]` - create a simulated cluster with N nodes
* `load_snapshot (path=/some/path | zkHost=ZK_CONNECT_STRING)` - create a simulated cluster from an autoscaling snapshot or from a live cluster.
@@ -557,7 +557,7 @@ The following commands are supported (command names are case insensitive, but pa
* `loop_start [iterations=N]`, `loop_end` - iterate commands enclosed in `loop_start` / `loop_end` N times, or until a loop abort is requested.
* `set_op_delays op1=delayMs1&op2=delayMs2...` - set operation delays for specific collection commands to simulate slow execution.
* `solr_request /admin/handler?httpMethod=POST&stream.body={'json':'body'}&other=params` - execute one of SolrRequest types supported by `SimCloudManager`.
-* `run [time=60000]` - run the simulator for some time, allowing background tasks to execute (eg. trigger event processing).
+* `run [time=60000]` - run the simulator for some time, allowing background tasks to execute (e.g., trigger event processing).
* `wait_collection collection=test&shards=N&replicas=M[&withInactive=false&requireLeaders=true&wait=90]` - wait until the collection shape matches the criteria or the wait time elapses (in which case an error is thrown).
* `event_listener trigger=triggerName&stage=SUCCEEDED[&beforeAction=foo | &afterAction=bar]` - prepare to listen for a specific trigger event.
* `wait_event trigger=triggerName[&wait=90]` - wait until an event specified in `event_listener` is captured or a wait time elapses (in which case an error is thrown).
@@ -576,7 +576,7 @@ Example scenario testing the behavior of `.autoAddReplicas` trigger:
# standard comment
// java comment
create_cluster numNodes=2 // inline comment
-// load autoscaling config from a json string. Notice that the value must be URL-encoded
+// load autoscaling config from a JSON string. Notice that the value must be URL-encoded
load_autoscaling json={'cluster-policy'+:+[{'replica'+:+'<3',+'shard'+:+'#EACH',+'collection'+:+'testCollection','node':'#ANY'}]}&defaultWaitFor=10
solr_request /admin/collections?action=CREATE&autoAddReplicas=true&name=testCollection&numShards=2&replicationFactor=2&maxShardsPerNode=2
wait_collection collection=testCollection&shards=2&replicas=2

@@ -566,7 +566,7 @@ are not used with low `every` values and an external scheduling process such as
====
== Default Triggers
-A fresh installation of SolrCloud always creates some default triggers. If these triggers are missing (eg. they were
+A fresh installation of SolrCloud always creates some default triggers. If these triggers are missing (e.g., they were
deleted) they are re-created on any autoscaling configuration change or Overseer restart. These triggers can be
suspended if their functionality somehow interferes with other configuration but they can't be permanently deleted.
@@ -598,7 +598,7 @@ Overseer leader such event may not be properly processed by the triggers (which
For this reason a special marker is created in ZooKeeper so that when the next Overseer leader is elected the
triggers will be able to learn about and process these past events.
-Triggers don't delete these markers once they are done processing (because several triggers may need them and eg.
+Triggers don't delete these markers once they are done processing (because several triggers may need them and e.g.,
scheduled triggers may run at arbitrary times with arbitrary delays) so Solr needs a mechanism to clean up
old markers for such events so that they don't accumulate over time. This trigger action performs the clean-up
- it deletes markers older than the configured time-to-live (by default it's 48 hours).

@@ -29,4 +29,4 @@ The following sections cover these topics:
* <<parameter-reference.adoc#parameter-reference,Parameter Reference>>
* <<command-line-utilities.adoc#command-line-utilities,Command Line Utilities>>
* <<solrcloud-with-legacy-configuration-files.adoc#solrcloud-with-legacy-configuration-files,SolrCloud with Legacy Configuration Files>>
-* <<configsets-api.adoc#configsets-api,ConfigSets API>>
+* <<configsets-api.adoc#configsets-api,Configsets API>>

@@ -42,7 +42,7 @@ In this section, we'll cover everything you need to know about using Solr in Sol
** <<parameter-reference.adoc#parameter-reference,Parameter Reference>>
** <<command-line-utilities.adoc#command-line-utilities,Command Line Utilities>>
** <<solrcloud-with-legacy-configuration-files.adoc#solrcloud-with-legacy-configuration-files,SolrCloud with Legacy Configuration Files>>
-** <<configsets-api.adoc#configsets-api,ConfigSets API>>
+** <<configsets-api.adoc#configsets-api,Configsets API>>
* <<rule-based-replica-placement.adoc#rule-based-replica-placement,Rule-based Replica Placement>>
* <<cross-data-center-replication-cdcr.adoc#cross-data-center-replication-cdcr,Cross Data Center Replication (CDCR)>>
* <<solrcloud-autoscaling.adoc#solrcloud-autoscaling,SolrCloud Autoscaling>>

@@ -132,25 +132,25 @@ bq=category:food^10
bq=category:deli^5
----
-Using the `bq` parameter in this way is functionally equivalent to combining your `q` and `bq` params into a single larger boolean query, where the (original) `q` param is "mandatory" and the other clauses are optional:
+Using the `bq` parameter in this way is functionally equivalent to combining your `q` and `bq` parameters into a single larger boolean query, where the (original) `q` parameter is "mandatory" and the other clauses are optional:
[source,text]
----
q=(+cheese category:food^10 category:deli^5)
----
-The only difference between the above examples is that using the `bq` param allows you to specify these extra clauses independently (ie: as configuration defaults) from the main query.
+The only difference between the above examples is that using the `bq` parameter allows you to specify these extra clauses independently (i.e., as configuration defaults) from the main query.
[TIP]
[[bq-bf-shortcomings]]
.Additive Boosts vs Multiplicative Boosts
====
-Generally speaking, using `bq` (or `bf`, below) is considered a poor way to "boost" documents by a secondary query because it has an "Additive" effect on the final score. The overall impact a particular `bq` param will have on a given document can vary a lot depending on the _absolute_ values of the scores from the original query as well as the `bq` query, which in turn depends on the complexity of the original query, and various scoring factors (TF, IDF, average field length, etc.)
+Generally speaking, using `bq` (or `bf`, below) is considered a poor way to "boost" documents by a secondary query because it has an "Additive" effect on the final score. The overall impact a particular `bq` parameter will have on a given document can vary a lot depending on the _absolute_ values of the scores from the original query as well as the `bq` query, which in turn depends on the complexity of the original query, and various scoring factors (TF, IDF, average field length, etc.)
"Multiplicative Boosting" is generally considered to be a more predictable method of influencing document score, because it acts as a "scaling factor" -- increasing (or decreasing) the scores of each document by a _relative_ amount.
-The <<other-parsers.adoc#boost-query-parser,`{!boost}` QParser>> provides a convenient wrapper for implementing multiplicative boosting, and the <<the-extended-dismax-query-parser.adoc#extended-dismax-parameters,`{!edismax}` QParser>> offers a `boost` query param short cut for using it.
+The <<other-parsers.adoc#boost-query-parser,`{!boost}` QParser>> provides a convenient wrapper for implementing multiplicative boosting, and the <<the-extended-dismax-query-parser.adoc#extended-dismax-parameters,`{!edismax}` QParser>> offers a `boost` query parameter shortcut for using it.
====
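A sketch of the multiplicative alternative described in the TIP (the collection, field, and boost function are illustrative):

[source,bash]
----
# Scale each document's score by 1/(1+price) using the {!boost} parser
curl 'http://localhost:8983/solr/mycollection/select' \
  --data-urlencode 'q={!boost b=div(1,sum(1,price))}cheese'
----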
@@ -166,7 +166,7 @@ bf=div(1,sum(1,price))^1.5
Specifying functions with the bf parameter is essentially just shorthand for using the `bq` parameter (<<#bq-bf-shortcomings,with the same shortcomings>>) combined with the `{!func}` parser -- with the addition of the simplified "query boost" syntax.
-For example, the two `bf` params listed below are completely equivilent to the two `bq` params below:
+For example, the two `bf` parameters listed below are completely equivalent to the two `bq` parameters below:
[source,text]
----

@@ -109,7 +109,7 @@ Solr supports modifying, adding and removing child documents as part of atomic u
Schema and configuration requirements are detailed in
<<updating-parts-of-documents#field-storage, Field Storage>> and <<indexing-nested-documents#schema-configuration, Indexing Nested Documents>>. +
Under the hood, Solr retrieves the whole nested structure, deletes the old documents,
-and re-indexes the structure after applying the atomic update. +
+and reindexes the structure after applying the atomic update. +
Syntactically, nested/partial updates are very similar to a regular atomic update,
as demonstrated by the examples below.
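As a stand-in for those examples, a minimal atomic update has this shape; nested updates use the same JSON syntax (collection, ID, and field names are hypothetical):

[source,bash]
----
# Atomically set a single field on an existing document
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/mycollection/update?commit=true' \
  --data-binary '[{"id": "parent1", "title_s": {"set": "New Title"}}]'
----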

@@ -53,7 +53,7 @@ IMPORTANT: It's a good idea to keep these files under version control.
== Uploading Configuration Files using bin/solr or SolrJ
-In production situations, <<config-sets.adoc#config-sets,Config Sets>> can also be uploaded to ZooKeeper independent of collection creation using either Solr's <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script>> or SolrJ.
+In production situations, <<config-sets.adoc#config-sets,Configsets>> can also be uploaded to ZooKeeper independent of collection creation using either Solr's <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script>> or SolrJ.
The below command can be used to upload a new configset using the bin/solr script.
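A typical invocation might look like the following sketch (configset name, path, and ZooKeeper address are illustrative):

[source,bash]
----
# Upload a local configset directory to ZooKeeper as "myconfig"
bin/solr zk upconfig -n myconfig -d /path/to/myconfig -z localhost:9983
----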