Ref Guide: standardize i.e., e.g., spellings; fix typos

Cassandra Targett 2020-03-02 11:52:52 -06:00
parent 043c5dff6f
commit 153d7bcfee
13 changed files with 69 additions and 41 deletions

View File

@ -141,7 +141,7 @@ Step 2: Get the `sha512` hash of the jar
openssl dgst -sha512 runtimelibs.jar
----
Step 3: Start Solr with runtime lib enabled
[source,bash]
----

View File

@ -155,27 +155,32 @@ Unlike the CLUSTERPROP command on the <<cluster-node-management.adoc#clusterprop
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181 -cmd clusterprop -name urlScheme -val https
----
=== Export Data from a Collection to a File
This command downloads documents from all shards in parallel and writes them to a single file. The supported formats are `jsonl` and `javabin`.
Arguments are:
`-url`:: (Required) The URL of the collection.
`-out`:: (Optional) The name of the file to write to. The default file name is `<collection-name>.json`. If the file name ends with `.json.gz`, the output is a gzipped JSON file.
`-format`:: (Optional) Supported values are `json` or `javabin`.
`-limit`:: (Optional) The number of documents to export. By default the entire collection is exported.
`-fields`:: (Optional) The fields to be exported. By default, all fields are exported.
Example 1: Export all documents in a collection `gettingstarted` into a file called `gettingstarted.json`:
[source,bash]
----
bin/solr export -url http://localhost:8983/solr/gettingstarted
----
Example 2: Export 1M documents from the collection `gettingstarted` into a file called `1MDocs.json.gz` as a gzipped JSON file:
[source,bash]
----
bin/solr export -url http://localhost:8983/solr/gettingstarted -out 1MDocs.json.gz
----
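A further sketch combining the optional arguments documented above (the field names `id` and `name` are only illustrative, and the field list is assumed to be comma-separated): export the first 10,000 documents with only two fields, in `javabin` format:
[source,bash]
----
bin/solr export -url http://localhost:8983/solr/gettingstarted -limit 10000 -fields id,name -format javabin -out gettingstarted.javabin
----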

View File

@ -16,7 +16,7 @@
// specific language governing permissions and limitations
// under the License.
Configsets are a set of configuration files used in a Solr installation: `solrconfig.xml`, the schema, and then <<resource-loading.adoc#resource-loading,resources>> like language files, `synonyms.txt`, DIH-related configuration, and others.
Such configuration, _configsets_, can be named and then referenced by collections or cores, possibly with the intent to share them to avoid duplication.
@ -26,7 +26,7 @@ Solr ships with two example configsets located in `server/solr/configsets`, whic
If you are using Solr in standalone mode, configsets are managed on the filesystem.
Each Solr core can have its very own configset located beneath it in a `<instance_dir>/conf/` dir.
Here, it is not named or shared and the word _configset_ isn't found.
In Solr's early years, this was _the only way_ it was configured.
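For example, creating a standalone core from the `_default` example configset copies those files into the core's own `conf/` directory. A minimal sketch (the core name `mycore` is hypothetical):
[source,bash]
----
# Create a core using the _default configset as its starting configuration
bin/solr create_core -c mycore -d _default

# The core now has its own private copy of the configuration files
ls server/solr/mycore/conf
# expect solrconfig.xml, managed-schema, synonyms.txt, a lang/ directory, etc.
----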

View File

@ -47,8 +47,8 @@ A SolrCloud cluster consists of some "logical" concepts layered on top of some "
** The level of redundancy built into the Collection and how fault tolerant the Cluster can be in the event that some Nodes become unavailable.
** The theoretical limit in the number of concurrent search requests that can be processed under heavy load.
WARNING: Make sure the DNS resolution in your cluster is stable, i.e.,
for each live host belonging to a Cluster the host name always corresponds to the
same specific IP and physical node. For example, in clusters deployed on AWS this would
require setting `preserve_hostname: true` in `/etc/cloud/cloud.cfg`. Changing DNS resolution
of live nodes may lead to unexpected errors. See SOLR-13159 for more details.
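On such a node, a quick way to check (and, if needed, set) that flag is sketched below; adjust for your image's cloud-init layout:
[source,bash]
----
# Check whether cloud-init is already configured to preserve the hostname
grep -n 'preserve_hostname' /etc/cloud/cloud.cfg

# If the setting is absent, add it so the hostname (and DNS resolution) stays stable
echo 'preserve_hostname: true' | sudo tee -a /etc/cloud/cloud.cfg
----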

View File

@ -341,12 +341,12 @@ include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-ipod-query-bool-f
== Additional Queries
Multiple additional queries might be specified under the `queries` key with all syntax alternatives described above. Every entry might have multiple values in an array. Notice that old-style referencing `"{!v=$query_name}"` picks only the first element in the array, ignoring everything beyond it; e.g., if one changes the reference below from `"{!v=$electronic}"` to `"{!v=$manufacturers}"` it's equivalent to querying for `manu:apple`, ignoring the later query. These queries don't impact query results until explicitly referenced.
[source,bash]
----
curl -X POST http://localhost:8983/solr/techproducts/query -d '
{
  "queries": {
    "electronic": {"field": {"f":"cat", "query":"electronics"}},
    "manufacturers": [

View File

@ -40,7 +40,7 @@ Certain plugins or add-ons to plugins require placement here.
They will document themselves to say so.
Solr incorporates Jetty for providing HTTP server functionality.
Jetty has some directories that contain `.jar` files for itself and its own plugins / modules or JVM level plugins (e.g., loggers).
Solr plugins won't work in these locations.
== Lib Directives in SolrConfig

View File

@ -192,7 +192,7 @@ This syntax has been removed entirely and if sent to Solr it will now produce an
The pattern language is very similar but not the same.
Typically, simply update the pattern by changing an uppercase 'Z' to lowercase 'z' and that's it.
+
For the current recommended set of patterns in schemaless mode, see the section <<schemaless-mode.adoc#schemaless-mode,Schemaless Mode>>, or simply examine the `_default` configset (found in `server/solr/configsets`).
+
Also note that the default set of date patterns (formats) has expanded from previous releases to subsume those patterns previously handled by the "extract" contrib (Solr Cell / Tika).

View File

@ -1117,21 +1117,39 @@ An optional parameter used to determine which of several query implementations s
----
== XCJF Query Parser
The Cross Collection Join Filter (XCJF) is a query parser plugin that will execute a query against a remote Solr collection to get back a set of join keys that will be used as a filter query against the local Solr collection.
The XCJF parser will create an XCJFQuery object.
The XCJFQuery will first query a remote Solr collection and get back a streaming expression result of the join keys.
As the join keys are streamed to the node, a bitset of the matching documents in the local index is built up.
This avoids keeping the full set of join keys in memory at any given time.
This bitset is then inserted into the filter cache upon successful execution as with the normal behavior of the Solr filter cache.
If the local index is sharded according to the join key field, the XCJF parser can leverage a secondary query parser called the "hash_range" query parser.
The hash_range query parser is responsible for returning only the documents that hash to a given range of values.
This allows the XCJFQuery to query the remote Solr collection and return only the join keys that would match a specific shard in the local Solr collection.
This has the benefit of making sure that network traffic doesn't increase as the number of shards increases and allows for much greater scalability.
The XCJF parser works with both String and Point types of fields.
The fields that are being used for the join key must be single-valued and have docValues enabled.
It's advised to shard the local collection by the join key, as this allows the optimization mentioned above to be utilized.
The XCJF parser should not generally be used as part of the `q` parameter; rather, it is designed to be used as a filter query (`fq` parameter) to ensure proper caching.
The remote Solr collection that is being queried should have a single-valued field for the join key with docValues enabled.
The remote Solr collection does not have any specific sharding requirements.
=== XCJF Query Parser Definition in solrconfig.xml
The XCJF parser has some configuration options that can be specified in `solrconfig.xml`.
`routerField`::
If the documents are routed to shards using the CompositeID router by the join field, then that field name should be specified in the configuration here. This will allow the parser to optimize the resulting HashRange query.
`solrUrl`::
If specified, this array of strings specifies the whitelisted Solr URLs that you can pass to the `solrUrl` query parameter. Without this configuration the `solrUrl` parameter cannot be used. This restriction is necessary to prevent an attacker from using Solr to explore the network.
[source,xml]
----
@ -1148,31 +1166,36 @@ If specified, this array of strings specifies the white listed Solr URLs that yo
=== XCJF Query Parameters
`collection`::
The name of the external Solr collection to be queried to retrieve the set of join key values (required).
`zkHost`::
The connection string to be used to connect to ZooKeeper (optional). `zkHost` and `solrUrl` are both optional parameters, and at most one of them should be specified. If neither `zkHost` nor `solrUrl` is specified, the local ZooKeeper cluster will be used.
`solrUrl`::
The URL of the external Solr node to be queried (optional; disabled by default for security). Must be a character-for-character exact match of a whitelisted URL.
`from`::
The join key field name in the external collection (required).
`to`::
The join key field name in the local collection.
`v`::
The query substituted in as a local param. This is the query string that will match documents in the remote collection.
`routed`::
If `true`, the XCJF query will use each shard's hash range to determine the set of join keys to retrieve for that shard.
This parameter improves the performance of the cross-collection join, but it depends on the local collection being routed by the `to` field.
If this parameter is not specified, the XCJF query will try to determine the correct value automatically.
`ttl`::
The length of time that an XCJF query in the cache will be considered valid, in seconds.
Defaults to `3600` (one hour).
The XCJF query will not be aware of changes to the remote collection, so if the remote collection is updated, cached XCJF queries may give inaccurate results.
After the `ttl` period has expired, the XCJF query will re-execute the join against the remote collection.
Other Parameters::
Any normal Solr query parameter can also be specified/passed through as a local param.
=== XCJF Query Examples
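One possible shape of such a query is sketched below. It assumes the parser has been registered under the name `xcjf` in `solrconfig.xml` and that both collections share a single-valued, docValues-enabled join key field named `product_id_s` (both names are hypothetical):
[source,bash]
----
# Filter the local collection to documents whose product_id_s matches a join key
# returned by querying the remote collection for category:electronics
curl 'http://localhost:8983/solr/localCollection/select' \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'fq={!xcjf collection=remoteCollection from=product_id_s to=product_id_s}category:electronics'
----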

View File

@ -196,7 +196,7 @@ The parameter sets can be used directly in a request handler definition as follo
To summarize, parameters are applied in this order (an example request follows the list):
* parameters defined in `<invariants>` in `solrconfig.xml`.
* parameters applied in `invariants` in `params.json` that are specified in the request handler definition or even in a single request.
* parameters defined in the request directly.
* parameter sets defined in the request, in the order they have been listed with `useParams`.
* parameter sets defined in `params.json` that have been defined in the request handler.
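As an illustrative sketch of the last two bullets (the parameter set name `myQueries` is hypothetical):
[source,bash]
----
# Create (or overwrite) a parameter set via the Request Parameters API
curl http://localhost:8983/solr/techproducts/config/params -H 'Content-type:application/json' -d '{
  "set": {
    "myQueries": {"defType": "edismax", "rows": "5"}
  }
}'

# Reference the parameter set at request time with useParams
curl 'http://localhost:8983/solr/techproducts/select?q=memory&useParams=myQueries'
----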

View File

@ -37,7 +37,7 @@ Prefer to put resources here.
== Resources in Other Places
Resources can also be placed in an arbitrary directory and <<libs.adoc#lib-directives-in-solrconfig,referenced>> from a `<lib />` directive in `solrconfig.xml`, provided the directive refers to a directory and not the actual resource file. Example: `<lib path="/volume/models/" />`
This choice may make sense if the resource is too large for a configset in ZooKeeper.
However, it's up to you to somehow ensure that all nodes in your cluster have access to these resources.
Finally, and this is very unusual, resources can also be packaged inside `.jar` files from which they will be referenced.

View File

@ -22,7 +22,7 @@ supported out of the box.
A sampled distributed tracing query request on Jaeger looks like this:
.Tracing of a Solr query
image::images/solr-tracing/query-request-tracing.png[image,width=600]
== Setup Tracer

View File

@ -128,8 +128,8 @@ on a convenient organization of the index, and should only be considered if norm
Streaming Expressions respect the <<distributed-requests.adoc#shards-preference-parameter,shards.preference parameter>> for any call to Solr.
The value of `shards.preference` that is used to route requests is determined in the following order. The first option available is used (see the sketch after this list).
- Provided as a parameter in the streaming expression (e.g., `search(...., shards.preference="replica.type:PULL")`)
- Provided in the URL Params of the streaming expression (e.g., `http://solr_url:8983/solr/stream?expr=....&shards.preference=replica.type:PULL`)
- Set as a default in the Cluster properties.
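A minimal sketch of the first option, sent to a collection's `/stream` endpoint (the collection name `gettingstarted` and the `id` field are assumptions):
[source,bash]
----
# Route the underlying /select calls of the expression to PULL replicas where possible
curl --data-urlencode 'expr=search(gettingstarted, q="*:*", fl="id", sort="id asc", shards.preference="replica.type:PULL")' http://localhost:8983/solr/gettingstarted/stream
----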
=== Adding Custom Expressions

View File

@ -153,7 +153,7 @@ To query for a field existing, simply use a wildcard instead of a term in the se
`field:*`
A field will be considered to "exist" if it has any value, even values which are often considered "not existent" (e.g., `NaN`, `""`, etc.).
=== Range Searches
@ -354,7 +354,7 @@ Solr's standard query parser originated as a variation of Lucene's "classic" Que
** `field:[* TO 100]` finds all field values less than or equal to 100
** `field:[100 TO *]` finds all field values greater than or equal to 100
** `field:[* TO *]` finds all documents where the field has a value between `-Infinity` and `Infinity`, excluding `NaN`.
** `field:*` finds all documents where the field exists (i.e., has any value).
* Pure negative queries (all clauses prohibited) are allowed (only as a top-level clause)
** `-inStock:false` finds all field values where inStock is not false
** `-field:*` finds all documents without a value for the field.
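These behaviors can be tried directly against the `techproducts` example collection; a sketch (`manu_id_s` and `inStock` are fields in that example data):
[source,bash]
----
# Existence query: documents that have any value in manu_id_s
curl 'http://localhost:8983/solr/techproducts/select' --data-urlencode 'q=manu_id_s:*'

# Pure negative query as a top-level clause: documents where inStock is not false
curl 'http://localhost:8983/solr/techproducts/select' --data-urlencode 'q=-inStock:false'
----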