Ref Guide: expand abbreviations, standardize some spellings

Cassandra Targett 2019-05-30 14:49:02 -05:00
parent df96a0e1b8
commit b86dd59fe1
17 changed files with 40 additions and 42 deletions


@@ -40,7 +40,7 @@ The current list of collections that are members of an alias can be verified via
The full definition of all aliases including metadata about that alias (in the case of routed aliases, see below)
can be verified via the <<collections-api.adoc#listaliases,LISTALIASES>> command.
-Alternatively this information is available by checking `/aliases.json` in zookeeper via a zookeeper
+Alternatively this information is available by checking `/aliases.json` in ZooKeeper with either the native ZooKeeper
client or in the <<cloud-screens.adoc#tree-view,tree page>> of the cloud menu in the admin UI.
Aliases may be deleted via the <<collections-api.adoc#deletealias,DELETEALIAS>> command.


@@ -115,7 +115,7 @@ Note how you can mix single string rules with lists of rules that must all match
* `type:<request-type>` (request-type by name: `ADMIN`, `SEARCH`, `UPDATE`, `STREAMING`, `UNKNOWN`)
* `collection:<collection-name>` (collection by name)
* `user:<userid>` (user by userid)
-* `path:</path/to/handler>` (request path relative to `/solr` or for search/update requests relative to collection. Path is prefix matched, i.e. `/admin` will mute any sub path as well.
+* `path:</path/to/handler>` (request path relative to `/solr` or for search/update requests relative to collection. Path is prefix matched, i.e., `/admin` will mute any sub path as well.
* `ip:<ip-address>` (IPv4-address)
* `param:<param>=<value>` (request parameter)
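Mute rules like those above are combined in the audit logger plugin's `muteRules` list, where a plain string is a single rule and a nested list requires every rule in it to match. A minimal sketch of such a configuration in `security.json` follows; the plugin class and rule values are illustrative, not taken from this commit:

```json
{
  "auditlogging": {
    "class": "solr.SolrLogAuditLoggerPlugin",
    "muteRules": [
      "type:STREAMING",
      ["path:/admin/zookeeper", "param:silent=true"]
    ]
  }
}
```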
@@ -139,7 +139,7 @@ Using the `MultiDestinationAuditLogger` you can configure multiple audit logger
----
== Metrics
-AuditLoggerPlugins record metrics about count and timing of log requests, as well as queue size for async loggers. The metrics keys are all recorded on the `SECURITY` category, and each metric name are prefixed with a scope of `/auditlogging` and the class name of the logger, e.g. `SolrLogAuditLoggerPlugin`. The individual metrics are:
+AuditLoggerPlugins record metrics about count and timing of log requests, as well as queue size for async loggers. The metrics keys are all recorded on the `SECURITY` category, and each metric name are prefixed with a scope of `/auditlogging` and the class name of the logger, e.g., `SolrLogAuditLoggerPlugin`. The individual metrics are:
* `count` (type: meter. Records number and rate of audit logs done)
* `errors` (type: meter. Records number and rate of errors)


@@ -54,7 +54,7 @@ There are several things defined in this file:
<2> The parameter `"blockUnknown":true` means that unauthenticated requests are not allowed to pass through.
<3> A user called 'solr', with a password `'SolrRocks'` has been defined.
<4> We override the `realm` property to display another text on the login prompt.
<5> The parameter `"forwardCredentials":false` means we let Solr's PKI authentication handle distributed requests instead of forwarding the Basic Auth header.
<6> The 'admin' role has been defined, and it has permission to edit security settings.
<7> The 'solr' user has been defined to the 'admin' role.
@@ -170,7 +170,7 @@ curl --user solr:SolrRocks http://localhost:8983/api/cluster/security/authentica
====
--
The authentication realm defaults to `solr` and is displayed in the `WWW-Authenticate` HTTP header and in the Admin UI login page. To change the realm, set the `realm` property:
[.dynamic-tabs]
--
@@ -225,7 +225,7 @@ Alternatively, users can use SolrJ's `PreemptiveBasicAuthClientBuilderFactory` t
To enable this feature, users should set the following system property `-Dsolr.httpclient.builder.factory=org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory`.
`PreemptiveBasicAuthClientBuilderFactory` allows applications to provide credentials in two different ways:
-. The `basicauth` system property can be passed, containing the credentials directly (e.g. `-Dbasicauth=username:password`). This option is straightforward, but may expose the credentials in the command line, depending on how they're set.
+. The `basicauth` system property can be passed, containing the credentials directly (e.g., `-Dbasicauth=username:password`). This option is straightforward, but may expose the credentials in the command line, depending on how they're set.
. The `solr.httpclient.config` system property can be passed, containing a path to a properties file holding the credentials. Inside this file the username and password can be specified as `httpBasicAuthUser` and `httpBasicAuthPassword`, respectively.
+
[source,bash]
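The properties file mentioned in the second option might look like the following sketch; the path and credential values are placeholders:

```properties
# Referenced via -Dsolr.httpclient.config=/path/to/basicauth.properties
httpBasicAuthUser=solr
httpBasicAuthPassword=SolrRocks
```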


@@ -43,7 +43,7 @@ image::images/cloud-screens/cloud-tree.png[image,width=487,height=250]
As an aid to debugging, the data shown in the "Tree" view can be exported locally using the following command: `bin/solr zk ls -r /`
== ZK Status View
-The "ZK Status" view gives an overview over the ZooKeeper servers or ensemble used by Solr. It lists whether running in `standalone` or `ensemble` mode, shows how many zookeepers are configured, and then displays a table listing detailed monitoring status for each of the zookeepers, including who is the leader, configuration parameters and more.
+The "ZK Status" view gives an overview over the ZooKeeper servers or ensemble used by Solr. It lists whether running in `standalone` or `ensemble` mode, shows how many ZooKeeper nodes are configured, and then displays a table listing detailed monitoring status for each node, including who is the leader, configuration parameters, and more.
image::images/cloud-screens/cloud-zkstatus.png[image,width=512,height=509]


@@ -87,7 +87,7 @@ A `false` value makes the results of a collection creation predictable and gives
This parameter is ignored if `createNodeSet` is not also specified.
`collection.configName`::
-Defines the name of the configuration (which *must already be stored in ZooKeeper*) to use for this collection. If not provided, Solr will use the configuration of `_default` configSet to create a new (and mutable) configSet named `<collectionName>.AUTOCREATED` and will use it for the new collection. When such a collection (that uses a copy of the _default configset) is deleted, the autocreated configset is not deleted by default.
+Defines the name of the configuration (which *must already be stored in ZooKeeper*) to use for this collection. If not provided, Solr will use the configuration of `_default` configset to create a new (and mutable) configset named `<collectionName>.AUTOCREATED` and will use it for the new collection. When such a collection (that uses a copy of the _default configset) is deleted, the autocreated configset is not deleted by default.
`router.field`::
If this parameter is specified, the router will look at the value of the field in an input document to compute the hash and identify a shard instead of looking at the `uniqueKey` field. If the field specified is null in the document, the document will be rejected.
@@ -1693,7 +1693,6 @@ http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=getting
"total": 49153,
"postings [PerFieldPostings(segment=_i formats=1)]": {
"total": 31023,
...
"fields": {
"dc": {
"flags": "I-----------",
@@ -1713,11 +1712,10 @@ http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=getting
"dc.date": {
"flags": "-Dsrn-------:1:1:8",
"schemaType": "pdates"
},
...
}
}}}}}}}}}}}
----
[[migrate]]
== MIGRATE: Migrate Documents to Another Collection
@@ -2651,7 +2649,7 @@ Backs up Solr collections and associated configurations to a shared filesystem -
`/admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive`
-The BACKUP command will backup Solr indexes and configurations for a specified collection. The BACKUP command takes one copy from each shard for the indexes. For configurations, it backs up the configSet that was associated with the collection and metadata.
+The BACKUP command will backup Solr indexes and configurations for a specified collection. The BACKUP command takes one copy from each shard for the indexes. For configurations, it backs up the configset that was associated with the collection and metadata.
=== BACKUP Parameters
@@ -2678,7 +2676,7 @@ The RESTORE operation will create a collection with the specified name in the co
The collection created will have the same number of shards and replicas as the original collection, preserving routing information, etc. Optionally, you can override some parameters documented below.
-While restoring, if a configSet with the same name exists in ZooKeeper then Solr will reuse that, or else it will upload the backed up configSet in ZooKeeper and use that.
+While restoring, if a configset with the same name exists in ZooKeeper then Solr will reuse that, or else it will upload the backed up configset in ZooKeeper and use that.
You can use the collection <<createalias,CREATEALIAS>> command to make sure clients don't need to change the endpoint to query or index against the newly restored collection.


@@ -58,9 +58,9 @@ Note that this command is the only one of the Core Admin API commands that *does
====
Your CREATE call must be able to find a configuration, or it will not succeed.
-When you are running SolrCloud and create a new core for a collection, the configuration will be inherited from the collection. Each collection is linked to a configName, which is stored in ZooKeeper. This satisfies the config requirement. There is something to note, though if you're running SolrCloud, you should *NOT* be using the CoreAdmin API at all. Use the <<collections-api.adoc#collections-api,Collections API>>.
+When you are running SolrCloud and create a new core for a collection, the configuration will be inherited from the collection. Each collection is linked to a configName, which is stored in ZooKeeper. This satisfies the configuration requirement. There is something to note, though: if you're running SolrCloud, you should *NOT* use the CoreAdmin API at all. Use the <<collections-api.adoc#collections-api,Collections API>>.
-When you are not running SolrCloud, if you have <<config-sets.adoc#config-sets,Config Sets>> defined, you can use the configSet parameter as documented below. If there are no configsets, then the `instanceDir` specified in the CREATE call must already exist, and it must contain a `conf` directory which in turn must contain `solrconfig.xml`, your schema (usually named either `managed-schema` or `schema.xml`), and any files referenced by those configs.
+When you are not running SolrCloud, if you have <<config-sets.adoc#config-sets,Config Sets>> defined, you can use the `configSet` parameter as documented below. If there are no configsets, then the `instanceDir` specified in the CREATE call must already exist, and it must contain a `conf` directory which in turn must contain `solrconfig.xml`, your schema (usually named either `managed-schema` or `schema.xml`), and any files referenced by those configs.
The config and schema filenames can be specified with the `config` and `schema` parameters, but these are expert options. One thing you could do to avoid creating the `conf` directory is use `config` and `schema` parameters that point at absolute paths, but this can lead to confusing configurations unless you fully understand what you are doing.
====
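As a sketch of the non-SolrCloud requirement described above, the `instanceDir` and its `conf` directory must exist before the CREATE call. The paths, core name, and port below are illustrative, not prescribed by the guide:

```shell
# Prepare a minimal instanceDir for a non-SolrCloud CREATE call
mkdir -p /tmp/mycore/conf
touch /tmp/mycore/conf/solrconfig.xml
touch /tmp/mycore/conf/managed-schema
# The CREATE call would then reference it, for example:
# curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&instanceDir=/tmp/mycore"
```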


@@ -61,7 +61,7 @@ The Document Builder provides a wizard-like interface to enter fields of a docum
The File Upload option allows choosing a prepared file and uploading it. If using `/update` for the Request-Handler option, you will be limited to XML, CSV, and JSON.
-Other document types (e.g Word, PDF, etc.) can be indexed using the ExtractingRequestHandler (aka, Solr Cell). You must modify the RequestHandler to `/update/extract`, which must be defined in your `solrconfig.xml` file with your desired defaults. You should also add `&literal.id` shown in the "Extracting Request Handler Params" field so the file chosen is given a unique id.
+Other document types (e.g., Word, PDF, etc.) can be indexed using the ExtractingRequestHandler (aka, Solr Cell). You must modify the RequestHandler to `/update/extract`, which must be defined in your `solrconfig.xml` file with your desired defaults. You should also add `&literal.id` shown in the "Extracting Request Handler Params" field so the file chosen is given a unique id.
More information can be found at: <<uploading-data-with-solr-cell-using-apache-tika.adoc#uploading-data-with-solr-cell-using-apache-tika,Uploading Data with Solr Cell using Apache Tika>>
== Solr Command


@@ -29,7 +29,7 @@ You can find `solr.xml` in your `$SOLR_HOME` directory (usually `server/solr`) i
<solr>
<int name="maxBooleanClauses">${solr.max.booleanClauses:1024}</int>
<solrcloud>
<str name="host">${host:}</str>
<int name="hostPort">${jetty.port:8983}</int>
@@ -92,7 +92,7 @@ This attribute, when set to `true`, ensures that the multiple cores pointing to
Defines how many cores with `transient=true` that can be loaded before swapping the least recently used core for a new core.
`configSetBaseDir`::
-The directory under which configSets for Solr cores can be found. Defaults to `$SOLR_HOME/configsets`.
+The directory under which configsets for Solr cores can be found. Defaults to `$SOLR_HOME/configsets`.
[[global-maxbooleanclauses]]
`maxBooleanClauses`::


@@ -56,7 +56,7 @@ With the exception of in-place updates, the whole block must be updated or delet
=== Rudimentary Root-only Schemas
These schemas do not contain any other nested related fields apart from `\_root_`.
-Many schemas in existence are this way simply because default configSets are this way, even if the application isn't using nested documents.
+Many schemas in existence are this way simply because default configsets are this way, even if the application isn't using nested documents.
If an application uses nested documents with such a schema, keep in mind that some related features aren't as effective since there is less information. Mainly the <<searching-nested-documents.adoc#child-doc-transformer,[child]>> transformer returns matching children in a flat list (not nested) and they are attached to the parent using the special field name `\_childDocuments_`.
With such a schema, typically you should have a field that differentiates a root doc from any nested children.
@@ -150,4 +150,4 @@ Do *not* add a root document that has the same ID of a child document. _This wi
To delete a nested document, you can delete it by the ID of the root document.
If you try to use an ID of a child document, nothing will happen since only root document IDs are considered.
If you use Solr's delete-by-query APIs, you *have to be careful* to ensure that no children remain of any documents that are being deleted. _Doing otherwise will violate integrity assumptions that Solr expects._
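For illustration, deleting a whole nested tree by its root document's ID can be expressed as a JSON body posted to `/update`; the ID value here is a placeholder:

```json
{ "delete": { "id": "root-document-id" } }
```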


@@ -23,11 +23,11 @@ Queries and filters provided in JSON requests can be specified using a rich, pow
== Query DSL Structure
The JSON Request API accepts query values in three different formats:
-* A valid <<the-standard-query-parser.adoc#the-standard-query-parser,query string>> that uses the default `deftype` (`lucene`, in most cases). e.g. `title:solr`.
+* A valid <<the-standard-query-parser.adoc#the-standard-query-parser,query string>> that uses the default `deftype` (`lucene`, in most cases). e.g., `title:solr`.
-* A valid <<local-parameters-in-queries.adoc#local-parameters-in-queries,local parameters query string>> that specifies its `deftype` explicitly. e.g. `{!dismax qf=title}solr`.
+* A valid <<local-parameters-in-queries.adoc#local-parameters-in-queries,local parameters query string>> that specifies its `deftype` explicitly. e.g., `{!dismax qf=title}solr`.
-* A valid JSON object with the name of the query parser and any relevant parameters. e.g. `{ "lucene": {"df":"title", "query":"solr"}}`.
+* A valid JSON object with the name of the query parser and any relevant parameters. e.g., `{ "lucene": {"df":"title", "query":"solr"}}`.
** The top level "query" JSON block generally only has a single property representing the name of the query parser to use. The value for the query parser property is a child block containing any relevant parameters as JSON properties. The whole structure is analogous to a "local-params" query string. The query itself (often represented in local params using the name `v`) is specified with the key `query` instead.
All of these syntaxes can be used to specify queries for either the JSON Request API's `query` or `filter` properties.
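For illustration, one request body can mix these formats; the sketch below reuses only the query strings already shown above:

```json
{
  "query": "{!dismax qf=title}solr",
  "filter": [
    "title:solr",
    { "lucene": { "df": "title", "query": "solr" } }
  ]
}
```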


@@ -281,7 +281,7 @@ Solr uses the *Noggit* JSON parser in its request API. Noggit is capable of mor
* Multi-line ("C style") comments can be inserted using `/\*` and `*/`
* strings can be single-quoted
* special characters can be backslash-escaped
-* trailing (extra) commas are silently ignored (e.g. `[9,4,3,]`)
+* trailing (extra) commas are silently ignored (e.g., `[9,4,3,]`)
* nbsp (non-break space, \u00a0) is treated as whitespace.
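A single, deliberately non-strict request body exercising these relaxations might look like the following sketch (the query value is illustrative):

```json
/* Noggit accepts C-style comments */
{
  'query': 'title:solr',  // single-quoted strings and end-of-line comments
  "limit": 10,            // this trailing comma is silently ignored
}
```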
== Debugging


@@ -58,7 +58,7 @@ principalClaim ; What claim id to pull principal from ;
claimsMatch ; JSON object of claims (key) that must match a regular expression (value). Example: `{ "foo" : "A|B" }` will require the `foo` claim to be either "A" or "B". ; (none)
adminUiScope ; Define what scope is requested when logging in from Admin UI ; If not defined, the first scope from `scope` parameter is used
authorizationEndpoint; The URL for the Id Provider's authorization endpoint ; Auto configured if `wellKnownUrl` is provided
-redirectUris ; Valid location(s) for redirect after external authentication. Takes a string or array of strings. Must be the base URL of Solr, e.g. https://solr1.example.com:8983/solr/ and must match the list of redirect URIs registered with the Identity Provider beforehand. ; Defaults to empty list, i.e. any node is assumed to be a valid redirect target.
+redirectUris ; Valid location(s) for redirect after external authentication. Takes a string or array of strings. Must be the base URL of Solr, e.g., https://solr1.example.com:8983/solr/ and must match the list of redirect URIs registered with the Identity Provider beforehand. ; Defaults to empty list, i.e., any node is assumed to be a valid redirect target.
|===
== More Configuration Examples


@@ -68,12 +68,12 @@ See the section <<solrcloud-autoscaling.adoc#solrcloud-autoscaling,SolrCloud Aut
== Configuration and Default Changes
=== New Default ConfigSet
-Several changes have been made to configSets that ship with Solr; not only their content but how Solr behaves in regard to them:
+Several changes have been made to configsets that ship with Solr; not only their content but how Solr behaves in regard to them:
* The `data_driven_configset` and `basic_configset` have been removed, and replaced by the `_default` configset. The `sample_techproducts_configset` also remains, and is designed for use with the example documents shipped with Solr in the `example/exampledocs` directory.
-* When creating a new collection, if you do not specify a configSet, the `_default` will be used.
-** If you use SolrCloud, the `_default` configSet will be automatically uploaded to ZooKeeper.
-** If you use standalone mode, the instanceDir will be created automatically, using the `_default` configSet as it's basis.
+* When creating a new collection, if you do not specify a configset, the `_default` will be used.
+** If you use SolrCloud, the `_default` configset will be automatically uploaded to ZooKeeper.
+** If you use standalone mode, the instanceDir will be created automatically, using the `_default` configset as it's basis.
=== Schemaless Improvements


@@ -132,7 +132,7 @@ Both these actions were and still are problematic. In-place-updates are safe tho
If you want to delete certain child documents and if you know they don't themselves have nested children
then you must do so with a delete-by-query technique.
-* Solr has a new field in the `\_default` configSet, called `_nest_path_`. This field stores the path of the document
+* Solr has a new field in the `\_default` configset, called `_nest_path_`. This field stores the path of the document
in the hierarchy for non-root documents.
See the sections <<indexing-nested-documents.adoc#indexing-nested-documents,Indexing Nested Documents>> and


@@ -57,7 +57,7 @@ This registry is returned at `solr.jvm` and includes the following information.
This registry is returned at `solr.node` and includes the following information. When making requests with the <<Metrics API>>, you can specify `&group=node` to limit to only these metrics.
-* handler requests (count, timing): collections, info, admin, configSets, etc.
+* handler requests (count, timing): collections, info, admin, configsets, etc.
* number of cores (loaded, lazy, unloaded)
=== Core (SolrCore) Registry


@@ -147,9 +147,9 @@ Please choose a configuration for the techproducts collection, available options
_default or sample_techproducts_configs [_default]
----
-We've reached another point where we will deviate from the default option. Solr has two sample sets of configuration files (called a _configSet_) available out-of-the-box.
+We've reached another point where we will deviate from the default option. Solr has two sample sets of configuration files (called a configset) available out-of-the-box.
-A collection must have a configSet, which at a minimum includes the two main configuration files for Solr: the schema file (named either `managed-schema` or `schema.xml`), and `solrconfig.xml`. The question here is which configSet you would like to start with. The `_default` is a bare-bones option, but note there's one whose name includes "techproducts", the same as we named our collection. This configSet is specifically designed to support the sample data we want to use, so enter `sample_techproducts_configs` at the prompt and hit kbd:[enter].
+A collection must have a configset, which at a minimum includes the two main configuration files for Solr: the schema file (named either `managed-schema` or `schema.xml`), and `solrconfig.xml`. The question here is which configset you would like to start with. The `_default` is a bare-bones option, but note there's one whose name includes "techproducts", the same as we named our collection. This configset is specifically designed to support the sample data we want to use, so enter `sample_techproducts_configs` at the prompt and hit kbd:[enter].
At this point, Solr will create the collection and again output to the screen the commands it issues.
@@ -529,13 +529,13 @@ Solr's schema is a single file (in XML) that stores the details about the fields
Earlier in the tutorial we mentioned copy fields, which are fields made up of data that originated from other fields. You can also define dynamic fields, which use wildcards (such as `*_t` or `*_s`) to dynamically create fields of a specific field type. These types of rules are also defined in the schema.
****
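A dynamic field rule such as `*_t` is declared in the schema along these lines; the field type name below is assumed from Solr's default schema and is illustrative:

```xml
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>
```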
-When you initially started Solr in the first exercise, we had a choice of a configSet to use. The one we chose had a schema that was pre-defined for the data we later indexed. This time, we're going to use a configSet that has a very minimal schema and let Solr figure out from the data what fields to add.
+When you initially started Solr in the first exercise, we had a choice of a configset to use. The one we chose had a schema that was pre-defined for the data we later indexed. This time, we're going to use a configset that has a very minimal schema and let Solr figure out from the data what fields to add.
-The data you're going to index is related to movies, so start by creating a collection named "films" that uses the `_default` configSet:
+The data you're going to index is related to movies, so start by creating a collection named "films" that uses the `_default` configset:
`bin/solr create -c films -s 2 -rf 2`
-Whoa, wait. We didn't specify a configSet! That's fine, the `_default` is appropriately named, since it's the default and is used if you don't specify one at all.
+Whoa, wait. We didn't specify a configset! That's fine, the `_default` is appropriately named, since it's the default and is used if you don't specify one at all.
We did, however, set two parameters `-s` and `-rf`. Those are the number of shards to split the collection across (2) and how many replicas to create (2). This is equivalent to the options we had during the interactive example from the first exercise.
@@ -573,13 +573,13 @@ http://localhost:7574/solr/admin/collections?action=CREATE&name=films&numShards=
"core":"films_shard1_replica_n2"}}}
----
-The first thing the command printed was a warning about not using this configSet in production. That's due to some of the limitations we'll cover shortly.
+The first thing the command printed was a warning about not using this configset in production. That's due to some of the limitations we'll cover shortly.
Otherwise, though, the collection should be created. If we go to the Admin UI at http://localhost:8983/solr/#/films/collection-overview we should see the overview screen.
==== Preparing Schemaless for the Films Data
-There are two parallel things happening with the schema that comes with the `_default` configSet.
+There are two parallel things happening with the schema that comes with the `_default` configset.
First, we are using a "managed schema", which is configured to only be modified by Solr's Schema API. That means we should not hand-edit it so there isn't confusion about which edits come from which source. Solr's Schema API allows us to make changes to fields, field types, and other types of schema rules.
@@ -896,7 +896,7 @@ Before you get started, create a new collection, named whatever you'd like. In t
`./bin/solr create -c localDocs -s 2 -rf 2`
-Again, as we saw from Exercise 2 above, this will use the `_default` configSet and all the schemaless features it provides. As we noted previously, this may cause problems when we index our data. You may need to iterate on indexing a few times before you get the schema right.
+Again, as we saw from Exercise 2 above, this will use the `_default` configset and all the schemaless features it provides. As we noted previously, this may cause problems when we index our data. You may need to iterate on indexing a few times before you get the schema right.
=== Indexing Ideas


@@ -115,9 +115,9 @@ as demonstrated by the examples below.
[NOTE]
====
-.\_route_ Param
+.\_route_ Parameter
To ensure each nested update is routed to its respective shard,
-`\_route_` param must be set to the root document's ID when the
+`\_route_` parameter must be set to the root document's ID when the
update does not have that root document.
====
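For illustration, an atomic update of a child document would carry `\_route_=<root-id>` as a request parameter on the `/update` call, with a body along these lines; the IDs and field name are placeholders, not taken from the guide:

```json
[ { "id": "child-doc-id", "title_t": { "set": "updated title" } } ]
```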