SOLR-10892: Phase 2: large pages with lots of tables + lots of heading cleanups & TOC placement changes

Cassandra Targett 2017-06-26 22:16:40 -05:00
parent 8da926e7cb
commit 93c96b06fb
9 changed files with 1738 additions and 1512 deletions

File diff suppressed because it is too large


@@ -1,6 +1,7 @@
= CoreAdmin API
:page-shortname: coreadmin-api
:page-permalink: coreadmin-api.html
:page-toclevels: 1
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
@@ -35,19 +36,13 @@ The `STATUS` action returns the status of all running Solr cores, or status for
`admin/cores?action=STATUS&core=_core-name_`
=== STATUS Parameters
`core`::
The name of a core, as listed in the "name" attribute of a `<core>` element in `solr.xml`.
`indexInfo`::
If `false`, information about the index will not be returned with a core STATUS request. In Solr implementations with a large number of cores (i.e., more than hundreds), retrieving the index information for each core can take a lot of time and isn't always required. The default is `true`.
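=== STATUS Example

As a quick illustration, the following call (using a hypothetical core named `my_core`) requests the status of a single core and skips the index information:

[source,bash]
http://localhost:8983/solr/admin/cores?action=STATUS&core=my_core&indexInfo=false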
[[CoreAdminAPI-CREATE]]
== CREATE
@@ -65,52 +60,60 @@ Note that this command is the only one of the Core Admin API commands that *does
====
Your CREATE call must be able to find a configuration, or it will not succeed.
When you are running SolrCloud and create a new core for a collection, the configuration will be inherited from the collection. Each collection is linked to a configName, which is stored in ZooKeeper. This satisfies the config requirement. There is something to note, though: if you're running SolrCloud, you should *NOT* be using the CoreAdmin API at all. Use the <<collections-api.adoc#collections-api,Collections API>>.
When you are not running SolrCloud, if you have <<config-sets.adoc#config-sets,Config Sets>> defined, you can use the configSet parameter as documented below. If there are no config sets, then the `instanceDir` specified in the CREATE call must already exist, and it must contain a `conf` directory which in turn must contain `solrconfig.xml`, your schema (usually named either `managed-schema` or `schema.xml`), and any files referenced by those configs.
The config and schema filenames can be specified with the `config` and `schema` parameters, but these are expert options. One thing you could do to avoid creating the `conf` directory is use `config` and `schema` parameters that point at absolute paths, but this can lead to confusing configurations unless you fully understand what you are doing.
====
.CREATE and the `core.properties` file
[IMPORTANT]
====
The `core.properties` file is built as part of the CREATE command. If you create a `core.properties` file yourself in a core directory and then try to use CREATE to add that core to Solr, you will get an error telling you that another core is already defined there. The `core.properties` file must NOT exist before calling the CoreAdmin API with the CREATE command.
====
=== CREATE Core Parameters
`name`::
The name of the new core. Same as `name` on the `<core>` element. This parameter is required.
`instanceDir`::
The directory where files for this core should be stored. Same as `instanceDir` on the `<core>` element. The default is the value specified for the `name` parameter if not supplied.
`config`::
Name of the config file (i.e., `solrconfig.xml`) relative to `instanceDir`.
`schema`::
Name of the schema file to use for the core. Please note that if you are using a "managed schema" (the default behavior) then any value for this property which does not match the effective `managedSchemaResourceName` will be read once, backed up, and converted for managed schema use. See <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>> for details.
`dataDir`::
Name of the data directory relative to `instanceDir`.
`configSet`::
Name of the configset to use for this core. For more information, see the section <<config-sets.adoc#config-sets,Config Sets>>.
`collection`::
The name of the collection to which this core belongs. The default is the name of the core. `collection._param_=_value_` causes a property of `_param_=_value_` to be set if a new collection is being created. Use `collection.configName=_config-name_` to point to the configuration for a new collection.
+
WARNING: While it's possible to create a core for a non-existent collection, this approach is not supported and not recommended. Always create a collection using the <<collections-api.adoc#collections-api,Collections API>> before creating a core directly for it.
`shard`::
The shard id this core represents. Normally you want to be auto-assigned a shard id.
`property._name_=_value_`::
Sets the core property _name_ to _value_. See the section on defining <<defining-core-properties.adoc#Definingcore.properties-core.properties_files,core.properties file contents>>.
`async`::
Request ID to track this action which will be processed asynchronously.
=== CREATE Example
[source,bash]
http://localhost:8983/solr/admin/cores?action=CREATE&name=my_core&collection=my_collection&shard=shard2
[[CoreAdminAPI-RELOAD]]
== RELOAD
@@ -126,18 +129,10 @@ This is useful when you've made changes to a Solr core's configuration on disk,
RELOAD performs "live" reloads of SolrCore, reusing some existing objects. Some configuration options, such as the `dataDir` location and `IndexWriter`-related settings in `solrconfig.xml`, cannot be changed and made active with a simple RELOAD action.
====
=== RELOAD Core Parameters
`core`::
The name of the core, as listed in the "name" attribute of a `<core>` element in `solr.xml`. This parameter is required.
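=== RELOAD Example

For example, to reload a hypothetical core named `my_core`:

[source,bash]
http://localhost:8983/solr/admin/cores?action=RELOAD&core=my_core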
[[CoreAdminAPI-RENAME]]
== RENAME
@@ -146,20 +141,17 @@ The `RENAME` action changes the name of a Solr core.
`admin/cores?action=RENAME&core=_core-name_&other=_other-core-name_`
=== RENAME Parameters
`core`::
The name of the Solr core to be renamed. This parameter is required.
`other`::
The new name for the Solr core. If the persistent attribute of `<solr>` is `true`, the new name will be written to `solr.xml` as the `name` attribute of the `<core>` attribute. This parameter is required.
`async`::
Request ID to track this action which will be processed asynchronously.
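=== RENAME Example

For example, to rename a hypothetical core `my_core` to `my_renamed_core`:

[source,bash]
http://localhost:8983/solr/admin/cores?action=RENAME&core=my_core&other=my_renamed_core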
[[CoreAdminAPI-SWAP]]
== SWAP
@@ -175,20 +167,17 @@ Do not use `SWAP` with a SolrCloud node. It is not supported and can result in t
====
=== SWAP Parameters
`core`::
The name of one of the cores to be swapped. This parameter is required.
`other`::
The name of one of the cores to be swapped. This parameter is required.
`async`::
Request ID to track this action which will be processed asynchronously.
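=== SWAP Example

For example, to swap two hypothetical cores named `my_core` and `my_other_core`:

[source,bash]
http://localhost:8983/solr/admin/cores?action=SWAP&core=my_core&other=my_other_core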
[[CoreAdminAPI-UNLOAD]]
== UNLOAD
@@ -204,22 +193,23 @@ The `UNLOAD` action requires a parameter (`core`) identifying the core to be rem
Unloading all cores in a SolrCloud collection causes the removal of that collection's metadata from ZooKeeper.
====
=== UNLOAD Parameters
`core`::
The name of a core to be removed. This parameter is required.
`deleteIndex`::
If `true`, will remove the index when unloading the core. The default is `false`.
`deleteDataDir`::
If `true`, removes the `data` directory and all sub-directories. The default is `false`.
`deleteInstanceDir`::
If `true`, removes everything related to the core, including the index directory, configuration files and other related files. The default is `false`.
`async`::
Request ID to track this action which will be processed asynchronously.
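=== UNLOAD Example

For example, to unload a hypothetical core named `my_core` and remove its index at the same time:

[source,bash]
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=my_core&deleteIndex=true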
[[CoreAdminAPI-MERGEINDEXES]]
== MERGEINDEXES
@@ -238,21 +228,20 @@ This approach allows us to define cores that may not have an index path that is
We can make this call run asynchronously by specifying the `async` parameter and passing a request-id. This id can then be used to check the status of the already submitted task using the REQUESTSTATUS API.
=== MERGEINDEXES Parameters
`core`::
The name of the target core/index. This parameter is required.
`indexDir`::
Multi-valued, directories that would be merged.
`srcCore`::
Multi-valued, source cores that would be merged.
`async`::
Request ID to track this action which will be processed asynchronously.
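=== MERGEINDEXES Example

As a sketch, merging two hypothetical source cores into a hypothetical target core could look like this:

[source,bash]
http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=my_merged_core&srcCore=my_core1&srcCore=my_core2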
[[CoreAdminAPI-SPLIT]]
== SPLIT
@@ -261,55 +250,56 @@ The `SPLIT` action splits an index into two or more indexes. The index being spl
The `SPLIT` action supports the parameters described below.
=== SPLIT Parameters
`core`::
The name of the core to be split. This parameter is required.
`path`::
Multi-valued, the directory path in which a piece of the index will be written. Either this parameter or `targetCore` must be specified. If this is specified, the `targetCore` parameter may not be used.
`targetCore`::
Multi-valued, the target Solr core to which a piece of the index will be merged. Either this parameter or `path` must be specified. If this is specified, the `path` parameter may not be used.
`ranges`::
A comma-separated list of hash ranges in hexadecimal format. If this parameter is used, `split.key` should not be. See the <<SPLIT Examples>> below for an example of how this parameter can be used.
`split.key`::
The key to be used for splitting the index. If this parameter is used, `ranges` should not be. See the <<SPLIT Examples>> below for an example of how this parameter can be used.
`async`::
Request ID to track this action which will be processed asynchronously.
=== SPLIT Examples
The `core` index will be split into as many pieces as the number of `path` or `targetCore` parameters.
==== Usage with two targetCore parameters:
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2
Here the `core` index will be split into two pieces and merged into the two `targetCore` indexes.
==== Usage with two path parameters:
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&path=/path/to/index/1&path=/path/to/index/2
The `core` index will be split into two pieces and written into the two directory paths specified.
==== Usage with the split.key parameter:
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&split.key=A!
Here all documents having the same route key as the `split.key`, i.e., 'A!', will be split from the `core` index and written to the `targetCore`.
==== Usage with ranges parameter:
[source,bash]
http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2&targetCore=core3&ranges=0-1f4,1f5-3e8,3e9-5dc
This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 1001-1500 specified in hexadecimal. Here the index will be split into three pieces, with each targetCore receiving documents matching the hash ranges specified, i.e., core1 will get documents with hash range 0-500, core2 will receive documents with hash range 501-1000 and, finally, core3 will receive documents with hash range 1001-1500. At least one hash range must be specified. Please note that using a single hash range equal to a route key's hash range is NOT equivalent to using the `split.key` parameter because multiple route keys can hash to the same range.
@@ -324,22 +314,17 @@ Request the status of an already submitted asynchronous CoreAdmin API call.
`admin/cores?action=REQUESTSTATUS&requestid=_id_`
=== Core REQUESTSTATUS Parameters
The REQUESTSTATUS command has only one parameter.
`requestid`::
The user defined request-id for the asynchronous request. This parameter is required.
The call below will return the status of an already submitted asynchronous CoreAdmin call.
[source,bash]
http://localhost:8983/solr/admin/cores?action=REQUESTSTATUS&requestid=1
[[CoreAdminAPI-REQUESTRECOVERY]]
== REQUESTRECOVERY
@@ -348,22 +333,15 @@ The `REQUESTRECOVERY` action manually asks a core to recover by synching with th
`admin/cores?action=REQUESTRECOVERY&core=_core-name_`
=== REQUESTRECOVERY Parameters
`core`::
The name of the core to re-sync. This parameter is required.
=== REQUESTRECOVERY Examples
[source,bash]
http://localhost:8981/solr/admin/cores?action=REQUESTRECOVERY&core=gettingstarted_shard1_replica1
The core to specify can be found by expanding the appropriate ZooKeeper node via the admin UI.


@@ -55,66 +55,98 @@ As you can see, the discovery Solr configuration is "SolrCloud friendly". Howeve
There are no attributes that you can specify in the `<solr>` tag, which is the root element of `solr.xml`. The sections below describe the child nodes of each XML element in `solr.xml`.
`adminHandler`::
This attribute does not need to be set.
+
If used, this attribute should be set to the FQN (Fully qualified name) of a class that inherits from CoreAdminHandler. For example, `<str name="adminHandler">com.myorg.MyAdminHandler</str>` would configure the custom admin handler (MyAdminHandler) to handle admin requests.
+
If this attribute isn't set, Solr uses the default admin handler, `org.apache.solr.handler.admin.CoreAdminHandler`.
`collectionsHandler`::
As above, for custom CollectionsHandler implementations.
`infoHandler`::
As above, for custom InfoHandler implementations.
`coreLoadThreads`::
Specifies the number of threads that will be assigned to load cores in parallel.
`coreRootDirectory`::
The root of the core discovery tree, defaults to `$SOLR_HOME` (by default, `server/solr`).
`managementPath`::
Currently non-operational.
`sharedLib`::
Specifies the path to a common library directory that will be shared across all cores. Any JAR files in this directory will be added to the search path for Solr plugins. This path is relative to `$SOLR_HOME`. Custom handlers may be placed in this directory.
`shareSchema`::
This attribute, when set to `true`, ensures that multiple cores pointing to the same Schema resource file will refer to the same IndexSchema object. Sharing the IndexSchema object makes loading the core faster. If you use this feature, make sure that no core-specific property is used in your Schema file.
`transientCacheSize`::
Defines how many cores with `transient=true` can be loaded before swapping the least recently used core for a new core.
`configSetBaseDir`::
The directory under which configSets for Solr cores can be found. Defaults to `$SOLR_HOME/configsets`.
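As an illustrative sketch (the element values here are examples, not recommendations), a few of these settings could be combined in `solr.xml` like this:

[source,xml]
----
<solr>
  <str name="coreRootDirectory">cores</str>
  <str name="sharedLib">lib</str>
  <int name="transientCacheSize">4</int>
</solr>
----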
=== The <solrcloud> Element
This element defines several parameters that relate to SolrCloud. This section is ignored unless the Solr instance is started with either `-DzkRun` or `-DzkHost`.
`distribUpdateConnTimeout`::
Used to set the underlying `connTimeout` for intra-cluster updates.
`distribUpdateSoTimeout`::
Used to set the underlying `socketTimeout` for intra-cluster updates.
`host`::
The hostname Solr uses to access cores.
`hostContext`::
The url context path.
`hostPort`::
The port Solr uses to access cores.
+
In the default `solr.xml` file, this is set to `${jetty.port:8983}`, which will use the Solr port defined in Jetty, and otherwise fall back to 8983.
`leaderVoteWait`::
When SolrCloud is starting up, how long each Solr node will wait for all known replicas for that shard to be found before assuming that any nodes that haven't reported are down.
`leaderConflictResolveWait`::
When trying to elect a leader for a shard, this property sets the maximum time a replica will wait to see conflicting state information to be resolved; temporary conflicts in state information can occur when doing rolling restarts, especially when the node hosting the Overseer is restarted.
+
Typically, the default value of `180000` (ms) is sufficient for conflicts to be resolved; you may need to increase this value if you have hundreds or thousands of small collections in SolrCloud.
`zkClientTimeout`::
A timeout for connection to a ZooKeeper server. It is used with SolrCloud.
`zkHost`::
In SolrCloud mode, the URL of the ZooKeeper host that Solr should use for cluster state information.
`genericCoreNodeNames`::
If `true`, node names are not based on the address of the node, but on a generic name that identifies the core. When a different machine takes over serving that core, things will be much easier to understand.
`zkCredentialsProvider` & `zkACLProvider`::
Optional parameters that can be specified if you are using <<zookeeper-access-control.adoc#zookeeper-access-control,ZooKeeper Access Control>>.
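For reference, a `<solrcloud>` section along the lines of the one in the default `solr.xml` shipped with Solr looks like this (the property defaults shown are illustrative):

[source,xml]
----
<solr>
  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>
</solr>
----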
=== The <logging> Element
`class`::
The class to use for logging. The corresponding JAR file must be available to Solr, perhaps through a `<lib>` directive in `solrconfig.xml`.
`enabled`::
Whether to enable logging (`true`/`false`).
==== The <logging><watcher> Element
`size`::
The number of log events that are buffered.
`threshold`::
The logging level above which your particular logging implementation will record. For example, when using log4j one might specify DEBUG, WARN, INFO, etc.
=== The <shardHandlerFactory> Element
@@ -127,26 +159,40 @@ Custom shard handlers can be defined in `solr.xml` if you wish to create a custo
Since this is a custom shard handler, sub-elements are specific to the implementation. The default and only shard handler provided by Solr is the `HttpShardHandlerFactory`, in which case the following sub-elements can be specified:
`socketTimeout`::
The read timeout for intra-cluster query and administrative requests. The default is the same as the `distribUpdateSoTimeout` specified in the `<solrcloud>` section.
`connTimeout`::
The connection timeout for intra-cluster query and administrative requests. Defaults to the `distribUpdateConnTimeout` specified in the `<solrcloud>` section.
`urlScheme`::
The URL scheme to be used in distributed search.
`maxConnectionsPerHost`::
Maximum connections allowed per host. Defaults to `20`.
`maxConnections`::
Maximum total connections allowed. Defaults to `10000`.
`corePoolSize`::
The initial core size of the threadpool servicing requests. Default is `0`.
`maximumPoolSize`::
The maximum size of the threadpool servicing requests. Default is unlimited.
`maxThreadIdleTime`::
The amount of time in seconds that idle threads persist for in the queue, before being killed. Default is `5` seconds.
`sizeOfQueue`::
If specified, the threadpool will use a backing queue of this maximum size instead of direct handoff. The default is to use a `SynchronousQueue`.
`fairnessPolicy`::
A boolean to configure if the threadpool favors fairness over throughput. Default is false to favor throughput.
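Putting a few of these together, a `<shardHandlerFactory>` configuration in `solr.xml` might look like this (the timeout and connection values are illustrative only):

```xml
<solr>
  <shardHandlerFactory name="shardHandlerFactory"
                       class="HttpShardHandlerFactory">
    <!-- read timeout for intra-cluster requests, in milliseconds -->
    <int name="socketTimeout">600000</int>
    <!-- connection timeout for intra-cluster requests, in milliseconds -->
    <int name="connTimeout">60000</int>
    <int name="maxConnectionsPerHost">20</int>
  </shardHandlerFactory>
</solr>
```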
=== The <metrics> Element
The `<metrics>` element in `solr.xml` allows you to customize the metrics reported by Solr. You can define system properties that should not be returned, or define custom suppliers and reporters.
In a default `solr.xml` you will not see any `<metrics>` configuration. If you would like to customize the metrics for your installation, see the section <<metrics-reporting.adoc#metrics-configuration,Metrics Configuration>>.

= Function Queries
:page-shortname: function-queries
:page-permalink: function-queries.html
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
Only functions with fast random access are recommended.
The functions available for function queries are described below.
=== abs Function
Returns the absolute value of the specified value or function.
*Syntax Examples*
* `abs(x)`
* `abs(-5)`
=== childfield(field) Function
Returns the value of the given field for one of the matched child docs when searching by <<other-parsers.adoc#OtherParsers-BlockJoinParentQueryParser,{!parent}>>. It can be used only in `sort` parameter.
*Syntax Examples*
* `sort=childfield(name) asc` implies `$q` as a second argument and therefore it assumes `q={!parent ..}..`;
* `sort=childfield(field,$bjq) asc` refers to a separate parameter `bjq={!parent ..}..`;
* `sort=childfield(field,{!parent of=...}...) desc` allows inlining the block join parent query
=== concat Function
Concatenates the given string fields, literals and other functions.
*Syntax Example*
* `concat(name," ",$param,def(opt,"-"))`
=== "constant" Function
Specifies a floating point constant.
*Syntax Example*
* `1.5`
=== def Function
`def` is short for default. Returns the value of field "field", or if the field does not exist, returns the default value specified. Yields the first value where `exists()==true`.
*Syntax Examples*
* `def(rating,5)`: This `def()` function returns the rating, or if no rating is specified in the doc, returns 5.
* `def(myfield,1.0)`: equivalent to `if(exists(myfield),myfield,1.0)`
=== div Function
Divides one value or function by another. `div(x,y)` divides `x` by `y`.
*Syntax Examples*
* `div(1,y)`
* `div(sum(x,100),max(y,1))`
=== dist Function
Returns the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances and calculates the distances between the two vectors. Each ValueSource must be a number.
There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector.
*Syntax Examples*
* `dist(2, x, y, 0, 0)`: calculates the Euclidean distance between (0,0) and (x,y) for each document.
* `dist(1, x, y, 0, 0)`: calculates the Manhattan (taxicab) distance between (0,0) and (x,y) for each document.
* `dist(2, x,y,z,0,0,0)`: Euclidean distance between (0,0,0) and (x,y,z) for each document.
* `dist(1,x,y,z,e,f,g)`: Manhattan distance between (x,y,z) and (e,f,g) where each letter is a field name.
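The arithmetic behind `dist` can be sketched in Python (a hedged illustration of the semantics described above, not Solr's implementation; the name `dist` here is just for the sketch):

```python
def dist(power, *coords):
    # The argument list splits evenly into two vectors, mirroring how
    # dist() pairs its ValueSource arguments: first half vs. second half.
    half = len(coords) // 2
    v1, v2 = coords[:half], coords[half:]
    # Minkowski distance of the given power between the two vectors.
    return sum(abs(a - b) ** power for a, b in zip(v1, v2)) ** (1.0 / power)

print(dist(2, 3, 4, 0, 0))  # Euclidean distance from (3,4) to (0,0) -> 5.0
print(dist(1, 3, 4, 0, 0))  # Manhattan distance -> 7.0
```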
=== docfreq(field,val) Function
Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index).
You can quote the term if it's more complex, or do parameter substitution for the term value.
*Syntax Examples*
* `docfreq(text,'solr')`
* `...&defType=func` `&q=docfreq(text,$myterm)&myterm=solr`
[[FunctionQueries-field]]
=== field Function
Returns the numeric docValues or indexed value of the field with the specified name. In its simplest (single argument) form, this function can only be used on single valued fields, and can be called using the name of the field as a string, or for most conventional field names simply use the field name by itself without using the `field(...)` syntax.
When using docValues, an optional 2nd argument can be specified to select the `min` or `max` value of multivalued fields.
0 is returned for documents without a value in the field.
*Syntax Examples*
These 3 examples are all equivalent:
* `myFloatFieldName`
* `field(myFloatFieldName)`
* `field("myFloatFieldName")`
The last form is convenient when your field name is atypical:
* `field("my complex float fieldName")`
For multivalued docValues fields:
* `field(myMultiValuedFloatField,min)`
* `field(myMultiValuedFloatField,max)`
=== hsin Function
The Haversine distance calculates the distance between two points on a sphere when traveling along the sphere. The values must be in radians. `hsin` also takes a Boolean argument to specify whether the function should convert its output to radians.
*Syntax Example*
* `hsin(2, true, x, y, 0, 0)`
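As a rough sketch of the underlying math (omitting the Boolean conversion flag, and using a hypothetical `hsin` Python function), the Haversine formula over radian inputs looks like:

```python
from math import sin, cos, asin, sqrt

def hsin(radius, x1, y1, x2, y2):
    # Haversine arithmetic only; Solr's hsin() additionally takes a
    # Boolean flag (not modeled here) controlling radian conversion.
    h = sin((x2 - x1) / 2) ** 2 + cos(x1) * cos(x2) * sin((y2 - y1) / 2) ** 2
    return 2 * radius * asin(sqrt(h))

print(hsin(2, 0.0, 0.0, 0.0, 0.0))  # identical points -> 0.0
```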
=== idf Function
Inverse document frequency; a measure of whether the term is common or rare across all documents. Obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. See also `tf`.
*Syntax Example*
* `idf(fieldName,'solr')`: measures the inverse of the frequency of the occurrence of the term `'solr'` in `fieldName`.
=== if Function
Enables conditional function queries. In `if(test,value1,value2)`:
* `test` is or refers to a logical value or expression that returns a logical value (TRUE or FALSE).
An expression can be any function which outputs boolean values, or even functions returning numeric values, in which case value 0 will be interpreted as false, or strings, in which case empty string is interpreted as false.
*Syntax Example*
* `if(termfreq(cat,'electronics'),popularity,42)`: This function checks each document to see if it contains the term "electronics" in the `cat` field. If it does, then the value of the `popularity` field is returned, otherwise the value of `42` is returned.
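The truthiness rules above (numeric `0` and the empty string count as false) can be sketched in Python; `if_func` is a hypothetical name standing in for Solr's `if()`:

```python
def if_func(test, value1, value2):
    # Python's falsiness for 0, 0.0, and "" matches the semantics
    # described above for Solr's if() function.
    return value1 if test else value2

print(if_func(4, "popularity_value", 42))   # non-zero termfreq -> value1
print(if_func(0, "popularity_value", 42))   # zero termfreq -> 42
print(if_func("", "popularity_value", 42))  # empty string -> 42
```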
=== linear Function
Implements `m*x+c` where `m` and `c` are constants and `x` is an arbitrary function. This is equivalent to `sum(product(m,x),c)`, but slightly more efficient as it is implemented as a single function.
*Syntax Examples*
* `linear(x,m,c)`
* `linear(x,2,4)`: returns `2*x+4`
=== log Function
Returns the log base 10 of the specified function.
*Syntax Examples*
* `log(x)`
* `log(sum(x,100))`
=== map Function
Maps any values of an input function `x` that fall within `min` and `max` inclusive to the specified `target`. The arguments `min` and `max` must be constants. The arguments `target` and `default` can be constants or functions.
If the value of `x` does not fall between `min` and `max`, then either the value of `x` is returned, or a default value is returned if specified as a 5th argument.
*Syntax Examples*
* `map(x,min,max,target)`
** `map(x,0,0,1)`: Changes any values of 0 to 1. This can be useful in handling default 0 values.
* `map(x,min,max,target,default)`
** `map(x,0,100,1,-1)`: Changes any values between `0` and `100` to `1`, and all other values to `-1`.
* `map(x,0,100,sum(x,599),docfreq(text,solr))`: Changes any values between `0` and `100` to `x+599`, and all other values to the frequency of the term 'solr' in the field `text`.
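The mapping rule can be sketched in Python (a hypothetical `map_func`, not Solr's implementation; the range check is inclusive on both ends, as described above):

```python
def map_func(x, lo, hi, target, default=None):
    # Values falling within [lo, hi] inclusive map to target; otherwise
    # x itself is returned, or the optional 5th-argument default.
    if lo <= x <= hi:
        return target
    return x if default is None else default

print(map_func(0, 0, 0, 1))          # 0 remapped to 1
print(map_func(50, 0, 100, 1, -1))   # in range -> 1
print(map_func(200, 0, 100, 1, -1))  # out of range -> -1
```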
=== max Function
Returns the maximum numeric value of multiple nested functions or constants, which are specified as arguments: `max(x,y,...)`. The `max` function can also be useful for "bottoming out" another function or field at some specified constant.
Use the `field(myfield,max)` syntax for <<FunctionQueries-field,selecting the maximum value of a single multivalued field>>.
*Syntax Example*
* `max(myfield,myotherfield,0)`
=== maxdoc Function
Returns the number of documents in the index, including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index).
*Syntax Example*
* `maxdoc()`
=== min Function
Returns the minimum numeric value of multiple nested functions or constants, which are specified as arguments: `min(x,y,...)`. The `min` function can also be useful for providing an "upper bound" on a function using a constant.
Use the `field(myfield,min)` <<FunctionQueries-field,syntax for selecting the minimum value of a single multivalued field>>.
*Syntax Example*
* `min(myfield,myotherfield,0)`
=== ms Function
Returns milliseconds of difference between its arguments. Dates are relative to the Unix or POSIX time epoch, midnight, January 1, 1970 UTC.
Arguments may be the name of an indexed `TrieDateField`, or date math based on a <<working-with-dates.adoc#working-with-dates,constant date or `NOW`>>.
* `ms()`: Equivalent to `ms(NOW)`, number of milliseconds since the epoch.
* `ms(a)`: Returns the number of milliseconds since the epoch that the argument represents.
* `ms(a,b)`: Returns the number of milliseconds that b occurs before a (that is, a - b).
*Syntax Examples*
* `ms(NOW/DAY)`
* `ms(2000-01-01T00:00:00Z)`
* `ms(mydatefield)`
* `ms(NOW,mydatefield)`
* `ms(mydatefield, 2000-01-01T00:00:00Z)`
* `ms(datefield1, datefield2)`
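The millisecond arithmetic can be sketched in Python (a hypothetical `ms` helper mirroring the two-argument form, with the epoch as the default second argument):

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms(a, b=EPOCH):
    # ms(a, b): milliseconds by which a follows b (that is, a - b).
    return round((a - b).total_seconds() * 1000)

# Milliseconds from the epoch to 2000-01-01T00:00:00Z.
print(ms(datetime(2000, 1, 1, tzinfo=timezone.utc)))  # -> 946684800000
```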
=== norm(_field_) Function
Returns the "norm" stored in the index for the specified field. This is the product of the index time boost and the length normalization factor, according to the {lucene-javadocs}/core/org/apache/lucene/search/similarities/Similarity.html[Similarity] for the field.
*Syntax Example*
* `norm(fieldName)`
=== numdocs Function
Returns the number of documents in the index, not including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index).
*Syntax Example*
* `numdocs()`
=== ord Function
Returns the ordinal of the indexed field value within the indexed list of terms for that field in Lucene index order (lexicographically ordered by unicode value), starting at 1.
In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering. The field must have a maximum of one value per document (not multi-valued). `0` is returned for documents without a value in the field.
IMPORTANT: `ord()` depends on the position in an index and can change when other documents are inserted or deleted.
See also `rord` below.
*Syntax Example*
* `ord(myIndexedField)`
* If there were only three values ("apple","banana","pear") for a particular field X, then `ord(X)` would be `1` for documents containing "apple", `2` for documents containing "banana", etc.
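The apple/banana/pear example can be sketched in Python (a hypothetical `ord_func`, illustrating only the 1-based lexicographic ordering described above, not segment behavior):

```python
def ord_func(indexed_values, doc_value):
    # 1-based position of doc_value in the lexicographically sorted
    # unique term list; 0 when the document has no value in the field.
    terms = sorted(set(indexed_values))
    return terms.index(doc_value) + 1 if doc_value in terms else 0

field_x = ["pear", "apple", "banana"]
print(ord_func(field_x, "apple"))   # -> 1
print(ord_func(field_x, "banana"))  # -> 2
print(ord_func(field_x, None))      # missing value -> 0
```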
=== payload Function
Returns the float value computed from the decoded payloads of the term specified.
The return value is computed using the `min`, `max`, or `average` of the decoded payloads. A special `first` function can be used instead of the others, to short-circuit term enumeration and return only the decoded payload of the first term.
The field specified must have float or integer payload encoding capability (via `DelimitedPayloadTokenFilter` or `NumericPayloadTokenFilter`). If no payload is found for the term, the default value is returned.
* `payload(field_name,term)`: default value is 0.0, `average` function is used.
* `payload(field_name,term,default_value)`: default value can be a constant, field name, or another float returning function. `average` function used.
* `payload(field_name,term,default_value,function)`: function values can be `min`, `max`, `average`, or `first`.
*Syntax Example*
* `payload(payloaded_field_dpf,term,0.0,first)`
=== pow Function
Raises the specified base to the specified power. `pow(x,y)` raises `x` to the power of `y`.
*Syntax Examples*
* `pow(x,y)`
* `pow(x,log(y))`
* `pow(x,0.5)`: the same as `sqrt`
=== product Function
Returns the product of multiple values or functions, which are specified in a comma-separated list. `mul(...)` may also be used as an alias for this function.
*Syntax Examples*
* `product(x,y,...)`
* `product(x,2)`
* `product(x,y)`
* `mul(x,y)`
=== query Function
Returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported through either parameter de-referencing `$otherparam` or direct specification of the query string in the <<local-parameters-in-queries.adoc#local-parameters-in-queries,Local Parameters>> through the `v` key.
*Syntax Examples*
* `query(subquery, default)`
* `q=product(popularity,query({!dismax v='solr rocks'}))`: returns the product of the popularity and the score of the DisMax query.
* `q=product(popularity,query($qq))&qq={!dismax}solr rocks`: equivalent to the previous query, using parameter de-referencing.
* `q=product(popularity,query($qq,0.1))&qq={!dismax}solr rocks`: specifies a default score of 0.1 for documents that don't match the DisMax query.
=== recip Function
Performs a reciprocal function with `recip(x,m,a,b)` implementing `a/(m*x+b)` where `m,a,b` are constants, and `x` is any arbitrarily complex function.
When `a` and `b` are equal, and `x>=0`, this function has a maximum value of `1` that drops as `x` increases. Increasing the value of `a` and `b` together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is `rord(datefield)`.
*Syntax Examples*
* `recip(myfield,m,a,b)`
* `recip(rord(creationDate),1,1000,1000)`
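The `a/(m*x+b)` formula and the "maximum of 1 when a and b are equal" property can be sketched in Python (a hypothetical `recip` helper, not Solr's implementation):

```python
def recip(x, m, a, b):
    # recip(x, m, a, b) implements a / (m*x + b).
    return a / (m * x + b)

print(recip(0, 1, 1000, 1000))     # x=0 with a == b -> the maximum, 1.0
print(recip(1000, 1, 1000, 1000))  # drops as x grows -> 0.5
```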
=== rord Function
Returns the reverse ordering of that returned by `ord`.
*Syntax Example*
* `rord(myDateField)`
=== scale Function
Scales values of the function `x` such that they fall between the specified `minTarget` and `maxTarget` inclusive. The current implementation traverses all of the function values to obtain the min and max, so it can pick the correct scale.
The current implementation cannot distinguish when documents have been deleted or documents that have no value. It uses `0.0` values for these cases. This means that if values are normally all greater than `0.0`, one can still end up with `0.0` as the `min` value to map from. In these cases, an appropriate `map()` function could be used as a workaround to change `0.0` to a value in the real range, as shown here: `scale(map(x,0,0,5),1,2)`
*Syntax Examples*
* `scale(x, minTarget, maxTarget)`
* `scale(x,1,2)`: scales the values of x such that all values will be between 1 and 2 inclusive.
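The linear rescaling into `[minTarget, maxTarget]` can be sketched in Python over a list of per-document values (a hypothetical `scale` helper; Solr operates on function values per document, but the arithmetic is the same):

```python
def scale(values, min_target, max_target):
    # Traverse all values to find min and max, then linearly rescale
    # every value into [min_target, max_target] inclusive.
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # guard against division by zero when all values are equal
    return [min_target + (v - lo) * (max_target - min_target) / span
            for v in values]

print(scale([10, 20, 30], 1, 2))  # -> [1.0, 1.5, 2.0]
```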
=== sqedist Function
The Square Euclidean distance calculates the 2-norm (Euclidean distance) but does not take the square root, thus saving a fairly expensive operation. It is often the case that applications that care about Euclidean distance do not need the actual distance, but instead can use the square of the distance. There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector.
*Syntax Example*
* `sqedist(x_td, y_td, 0, 0)`
=== sqrt Function
Returns the square root of the specified value or function.
*Syntax Examples*
* `sqrt(x)`
* `sqrt(100)`
* `sqrt(sum(x,100))`
=== strdist Function
Calculates the distance between two strings. Uses the Lucene spell checker `StringDistance` interface and supports all of the implementations available in that package, plus allows applications to plug in their own via Solr's resource loading capabilities. `strdist` takes (string1, string2, distance measure).
Possible values for distance measure are:
* jw: Jaro-Winkler
* edit: Levenshtein or Edit distance
* ngram: The NGramDistance, if specified, can optionally pass in the ngram size too. Default is 2.
* FQN: Fully Qualified class Name for an implementation of the StringDistance interface. Must have a no-arg constructor.
*Syntax Example*
* `strdist("SOLR",id,edit)`
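The raw arithmetic of the `edit` measure can be sketched in Python. Note this is only the classic Levenshtein dynamic program; Lucene's `StringDistance` implementations actually return a normalized similarity score, so treat this as an illustration of the distance itself:

```python
def edit_distance(s1, s2):
    # Row-by-row Levenshtein DP: prev[j] holds the distance between
    # the first i-1 chars of s1 and the first j chars of s2.
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        cur = [i]
        for j, c2 in enumerate(s2, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (c1 != c2)))  # substitution
        prev = cur
    return prev[-1]

print(edit_distance("SOLR", "SOLAR"))  # one insertion -> 1
```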
=== sub Function
Returns `x-y` from `sub(x,y)`.
*Syntax Examples*
* `sub(myfield,myfield2)`
* `sub(100, sqrt(myfield))`
=== sum Function
Returns the sum of multiple values or functions, which are specified in a comma-separated list. `add(...)` may be used as an alias for this function.
*Syntax Examples*
* `sum(x,y,...)`
* `sum(x,1)`
* `sum(x,y)`
* `sum(sqrt(x),log(y),z,0.5)`
* `add(x,y)`
=== sumtotaltermfreq Function
Returns the sum of `totaltermfreq` values for all terms in the field in the entire index (i.e., the number of indexed tokens for that field). (Aliases `sumtotaltermfreq` to `sttf`.)
*Syntax Example*
If doc1:(fieldX:A B C) and doc2:(fieldX:A A A A):
* `docFreq(fieldX:A)` = 2 (A appears in 2 docs)
* `freq(doc2, fieldX:A)` = 4 (A appears 4 times in doc2)
* `totalTermFreq(fieldX:A)` = 5 (A appears 5 times across all docs)
* `sumTotalTermFreq(fieldX)` = 7 (in `fieldX`, there are 5 As, 1 B, and 1 C)
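The four statistics in this example can be reproduced against a toy two-document "index"; this Python sketch is illustrative only and does not use Solr's APIs:

[source,python]
----
from collections import Counter

# Toy index: doc id -> tokens in fieldX (mirrors the example above)
index = {
    "doc1": ["A", "B", "C"],
    "doc2": ["A", "A", "A", "A"],
}

def doc_freq(term):
    """docFreq: number of documents containing the term."""
    return sum(1 for toks in index.values() if term in toks)

def term_freq(doc, term):
    """termfreq: occurrences of the term in one document's field."""
    return Counter(index[doc])[term]

def total_term_freq(term):
    """totaltermfreq (ttf): occurrences of the term across all documents."""
    return sum(Counter(toks)[term] for toks in index.values())

def sum_total_term_freq():
    """sumtotaltermfreq (sttf): total indexed tokens for the field."""
    return sum(len(toks) for toks in index.values())
----

Running these against the toy index reproduces the numbers shown above: 2, 4, 5, and 7.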
=== termfreq Function
Returns the number of times the term appears in the field for that document.
*Syntax Example*
* `termfreq(text,'memory')`
=== tf Function
Term frequency; returns the term frequency factor for the given term, using the {lucene-javadocs}/core/org/apache/lucene/search/similarities/Similarity.html[Similarity] for the field. The `tf-idf` value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the document, which helps to control for the fact that some words are generally more common than others. See also `idf`.
*Syntax Examples*
* `tf(text,'solr')`
=== top Function
Causes the function query argument to derive its values from the top-level IndexReader containing all parts of an index. For example, the ordinal of a value in a single segment will be different from the ordinal of that same value in the complete index.
The `ord()` and `rord()` functions implicitly use `top()`, and hence `ord(foo)` is equivalent to `top(ord(foo))`.
=== totaltermfreq Function
Returns the number of times the term appears in the field in the entire index. (Aliases `totaltermfreq` to `ttf`.)
*Syntax Example*
* `ttf(text,'memory')`
== Boolean Functions
The following functions are boolean: they return `true` or `false`. They are mostly useful as the first argument of the `if` function, and some of these can be combined. If used somewhere else, they will yield a `1` or `0`.
=== and Function
Returns a value of true if and only if all of its operands evaluate to true.
*Syntax Example*
* `and(not(exists(popularity)),exists(price))`: returns `true` for any document which has a value in the `price` field, but does not have a value in the `popularity` field.
=== or Function
A logical disjunction.
*Syntax Example*
* `or(value1,value2)`: `true` if either `value1` or `value2` is true.
=== xor Function
Logical exclusive disjunction, or one or the other but not both.
*Syntax Example*
* `xor(field1,field2)` returns `true` if either `field1` or `field2` is true; `false` if both are true.
=== not Function
The logically negated value of the wrapped function.
*Syntax Example*
* `not(exists(author))`: `true` only when `exists(author)` is false.
=== exists Function
Returns `true` if any member of the field exists.
*Syntax Example*
* `exists(author)`: returns `true` for any document that has a value in the "author" field.
* `exists(query(price:5.00))`: returns `true` if "price" matches "5.00".
=== Comparison Functions
`gt`, `gte`, `lt`, `lte`, `eq`
Five comparison functions: Greater Than, Greater Than or Equal, Less Than, Less Than or Equal, and Equal.
*Syntax Example*
* `if(lt(ms(mydatefield),315569259747),0.8,1)` translates to this pseudocode: `if mydatefield < 315569259747 then 0.8 else 1`
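The pseudocode above can be sketched directly; this hypothetical Python analogue shows how a comparison function feeds the `if` function (boolean results yield 1/0 when used elsewhere):

[source,python]
----
def solr_if(cond, then_val, else_val):
    """Mirrors if(cond,then,else)."""
    return then_val if cond else else_val

def boost_for(mydatefield_ms: int) -> float:
    """if(lt(ms(mydatefield),315569259747),0.8,1) from the example above."""
    return solr_if(mydatefield_ms < 315569259747, 0.8, 1)
----

For a date earlier than the cutoff, `boost_for` yields 0.8; otherwise it yields 1.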
[[FunctionQueries-ExampleFunctionQueries]]
== Example Function Queries
@@ -27,32 +27,68 @@ Highlighting is extremely configurable, perhaps more than any other part of Solr
[[Highlighting-Usage]]
== Usage
=== Common Highlighter Parameters
You only need to set the `hl` and often `hl.fl` parameters to get results. The following table documents these and some other supported parameters. Note that many highlighting parameters support per-field overrides, such as: `f._title_txt_.hl.snippets`
`hl`::
Use this parameter to enable or disable highlighting. The default is `false`. If you want to use highlighting, you must set this to `true`.
`hl.method`::
The highlighting implementation to use. Acceptable values are: `unified`, `original`, `fastVector`. The default is `original`.
+
See the <<Highlighting-ChoosingaHighlighter,Choosing a Highlighter>> section below for more details on the differences between the available highlighters.

`hl.fl`::
Specifies a list of fields to highlight. Accepts a comma- or space-delimited list of fields for which Solr should generate highlighted snippets.
+
A wildcard of `\*` (asterisk) can be used to match field globs, such as `text_*` or even `\*` to highlight on all fields where highlighting is possible. When using `*`, consider adding `hl.requireFieldMatch=true`.
+
When not defined, the defaults defined for the `df` query parameter will be used.
`hl.q`::
A query to use for highlighting. This parameter allows you to highlight different terms than those being used to retrieve documents.
+
When not defined, the query defined with the `q` parameter will be used.
`hl.qparser`::
The query parser to use for the `hl.q` query.
+
When not defined, the query parser defined with the `defType` query parameter will be used.
`hl.requireFieldMatch`::
If set to `false`, the default, all query terms will be highlighted for each field to be highlighted (`hl.fl`) no matter which fields the parsed query refers to. If set to `true`, only query terms aligning with the field being highlighted will in turn be highlighted.
+
If the query references fields different from the field being highlighted and they have different text analysis, the query may not highlight query terms it should have and vice versa. The analysis used is that of the field being highlighted (`hl.fl`), not the query fields.
`hl.usePhraseHighlighter`::
If set to `true`, the default, Solr will highlight phrase queries (and other advanced position-sensitive queries) accurately as phrases. If `false`, the parts of the phrase will be highlighted everywhere instead of only when it forms the given phrase.
`hl.highlightMultiTerm`::
If set to `true`, the default, Solr will highlight wildcard queries (and other `MultiTermQuery` subclasses). If `false`, they won't be highlighted at all.
`hl.snippets`::
Specifies maximum number of highlighted snippets to generate per field. It is possible for any number of snippets from zero to this value to be generated. The default is `1`.
`hl.fragsize`::
Specifies the approximate size, in characters, of fragments to consider for highlighting. The default is `100`. Using `0` indicates that no fragmenting should be considered and the whole field value should be used.
`hl.tag.pre`::
(`hl.simple.pre` for the Original Highlighter) Specifies the “tag” to use before a highlighted term. This can be any string, but is most often an HTML or XML tag.
+
The default is `<em>`.
`hl.tag.post`::
(`hl.simple.post` for the Original Highlighter) Specifies the “tag” to use after a highlighted term. This can be any string, but is most often an HTML or XML tag.
+
The default is `</em>`.
`hl.encoder`::
If blank, the default, then the stored text will be returned without any escaping/encoding performed by the highlighter. If set to `html` then special HTML/XML characters will be encoded (e.g., `&` becomes `\&amp;`). The pre/post snippet characters are never encoded.
`hl.maxAnalyzedChars`::
The character limit to look for highlights, after which no highlighting will be done. This is mostly only a performance concern for an _analysis_ based offset source since it's the slowest. See <<Schema Options and Performance Considerations>>.
+
The default is `51200` characters.
More parameters are supported as well, depending on the highlighter (via `hl.method`) chosen.
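For instance, a request combining several of the common parameters, including a per-field override, might look like this (the `techproducts` collection name and query are illustrative):

[source,text]
----
http://localhost:8983/solr/techproducts/select?q=apple&hl=on&hl.fl=manu&hl.snippets=2&f.manu.hl.fragsize=80
----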
@@ -73,30 +109,26 @@ we get a response such as this (truncated slightly for space):
[source,json]
----
{
  "responseHeader": {
    "..."
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [{
      "id": "MA147LL/A",
      "name": "Apple 60 GB iPod with Video Playback Black",
      "manu": "Apple Computer Inc.",
      "cat": [
        "electronics",
        "music"
      ]
    }]
  },
  "highlighting": {
    "MA147LL/A": {
      "manu": [
        "<em>Apple</em> Computer Inc."
      ]
    }
  }
}
----
@@ -171,49 +203,114 @@ This adds substantial weight to the index similar in size to the compressed
The Unified Highlighter supports the following parameters in addition to those listed earlier:
`hl.offsetSource`::
By default, the Unified Highlighter will usually pick the right offset source (see above). However, it may be ambiguous, such as during a migration from one offset source to another that hasn't completed.
+
The offset source can be explicitly configured to one of: `ANALYSIS`, `POSTINGS`, `POSTINGS_WITH_TERM_VECTORS`, or `TERM_VECTORS`.
`hl.tag.ellipsis`::
By default, each snippet is returned as a separate value (as is done with the other highlighters). Set this parameter to instead return one string with this text as the delimiter. _Note: this is likely to be removed in the future._
`hl.defaultSummary`::
If `true`, use the leading portion of the text as a snippet if a proper highlighted snippet can't otherwise be generated. The default is `false`.
`hl.score.k1`::
Specifies BM25 term frequency normalization parameter 'k1'. For example, it can be set to `0` to rank passages solely based on the number of query terms that match. The default is `1.2`.
`hl.score.b`::
Specifies BM25 length normalization parameter 'b'. For example, it can be set to "0" to ignore the length of passages entirely when ranking. The default is `0.75`.
`hl.score.pivot`::
Specifies BM25 average passage length in characters. The default is `87`.
`hl.bs.language`::
Specifies the breakiterator language for dividing the document into passages.
`hl.bs.country`::
Specifies the breakiterator country for dividing the document into passages.
`hl.bs.variant`::
Specifies the breakiterator variant for dividing the document into passages.
`hl.bs.type`::
Specifies the breakiterator type for dividing the document into passages. Can be `SEPARATOR`, `SENTENCE`, `WORD`, `CHARACTER`, `LINE`, or `WHOLE`. `SEPARATOR` is a special value that splits text on a user-provided character in `hl.bs.separator`.
+
The default is `SENTENCE`.
`hl.bs.separator`::
Indicates which character to break the text on. Use only if you have defined `hl.bs.type=SEPARATOR`.
+
This is useful when the text has already been manipulated in advance to have a special delineation character at desired highlight passage boundaries. This character will still appear in the text as the last character of a passage.
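As an illustration of the `SEPARATOR` behavior described above (not Solr's implementation), a passage splitter that keeps the delimiter as the last character of each passage might look like:

[source,python]
----
def split_passages(text: str, sep: str):
    """Split text into passages; sep stays as the last character of each passage."""
    passages, start = [], 0
    for i, ch in enumerate(text):
        if ch == sep:
            passages.append(text[start:i + 1])
            start = i + 1
    if start < len(text):          # trailing text without a final separator
        passages.append(text[start:])
    return passages
----

For example, `split_passages("first|second|third", "|")` yields `["first|", "second|", "third"]`.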
[[Highlighting-TheOriginalHighlighter]]
== The Original Highlighter
The Original Highlighter supports the following parameters in addition to those listed earlier:
`hl.mergeContiguous`::
Instructs Solr to collapse contiguous fragments into a single fragment. A value of `true` indicates contiguous fragments will be collapsed into single fragment. The default value, `false`, is also the backward-compatible setting.
`hl.maxMultiValuedToExamine`::
Specifies the maximum number of entries in a multi-valued field to examine before stopping. This can potentially return zero results if the limit is reached before any matches are found.
+
If used with the `maxMultiValuedToMatch`, whichever limit is reached first will determine when to stop looking.
+
The default is `Integer.MAX_VALUE`.
`hl.maxMultiValuedToMatch`::
Specifies the maximum number of matches in a multi-valued field that are found before stopping.
+
If `hl.maxMultiValuedToExamine` is also defined, whichever limit is reached first will determine when to stop looking.
+
The default is `Integer.MAX_VALUE`.
`hl.alternateField`::
Specifies a field to be used as a backup default summary if Solr cannot generate a snippet (i.e., because no terms match).
`hl.maxAlternateFieldLength`::
Specifies the maximum number of characters of the field to return. Any value less than or equal to `0` means the field's length is unlimited (the default behavior).
+
This parameter is only used in conjunction with the `hl.alternateField` parameter.
`hl.highlightAlternate`::
If set to `true`, the default, and `hl.alternateFieldName` is active, Solr will show the entire alternate field, with highlighting of occurrences. If `hl.maxAlternateFieldLength=N` is used, Solr returns max `N` characters surrounding the best matching fragment.
+
If set to `false`, or if there is no match in the alternate field either, the alternate field will be shown without highlighting.
`hl.formatter`::
Selects a formatter for the highlighted output. Currently the only legal value is `simple`, which surrounds a highlighted term with a customizable pre- and post-text snippet.
`hl.simple.pre`, `hl.simple.post`::
Specifies the text that should appear before (`hl.simple.pre`) and after (`hl.simple.post`) a highlighted term, when using the simple formatter. The default is `<em>` and `</em>`.
`hl.fragmenter`::
Specifies a text snippet generator for highlighted text. The standard (default) fragmenter is `gap`, which creates fixed-sized fragments with gaps for multi-valued fields.
+
Another option is `regex`, which tries to create fragments that resemble a specified regular expression.
`hl.regex.slop`::
When using the regex fragmenter (`hl.fragmenter=regex`), this parameter defines the factor by which the fragmenter can stray from the ideal fragment size (given by `hl.fragsize`) to accommodate a regular expression.
+
For instance, a slop of `0.2` with `hl.fragsize=100` should yield fragments between 80 and 120 characters in length. It is usually good to provide a slightly smaller `hl.fragsize` value when using the regex fragmenter.
+
The default is `0.6`.
`hl.regex.pattern`::
Specifies the regular expression for fragmenting. This could be used to extract sentences.
`hl.regex.maxAnalyzedChars`::
Instructs Solr to analyze only this many characters from a field when using the regex fragmenter (after which, the fragmenter produces fixed-sized fragments). The default is `10000`.
+
Note, applying a complicated regex to a huge field is computationally expensive.
`hl.preserveMulti`::
If `true`, multi-valued fields will return all values in the order they were saved in the index. If `false`, the default, only values that match the highlight request will be returned.
`hl.payloads`::
When `hl.usePhraseHighlighter` is `true` and the indexed field has payloads but not term vectors (generally quite rare), the index's payloads will be read into the highlighter's memory index along with the postings.
+
If this may happen and you know you don't need them for highlighting (i.e., your queries don't filter by payload) then you can save a little memory by setting this to `false`.
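The slop arithmetic described for `hl.regex.slop` can be sketched as follows; whether the fragmenter rounds or truncates at the boundaries is an implementation detail, so treat the exact endpoints as approximate:

[source,python]
----
def fragment_bounds(fragsize: int, slop: float):
    """Approximate range of fragment sizes the regex fragmenter may produce."""
    return (round(fragsize * (1 - slop)), round(fragsize * (1 + slop)))
----

With the values from the example above, `fragment_bounds(100, 0.2)` gives `(80, 120)`; the default slop of 0.6 allows a much wider `(40, 160)`.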
The Original Highlighter has a plugin architecture that enables new functionality to be registered in `solrconfig.xml`. The `techproducts` configset shows most of these settings explicitly. You can use it as a guide to provide your own components to include a `SolrFormatter`, `SolrEncoder`, and `SolrFragmenter`.
@@ -230,18 +327,28 @@ In addition to the initial listed parameters, the following parameters documente
And here are additional parameters supported by the FVH:
`hl.fragListBuilder`::
The snippet fragmenting algorithm. The `weighted` fragListBuilder uses IDF-weights to order fragments. This fragListBuilder is the default.
+
Other options are `single`, which returns the entire field contents as one snippet, or `simple`. You can select a fragListBuilder with this parameter, or modify an existing implementation in `solrconfig.xml` to be the default by adding `default="true"`.
`hl.fragmentsBuilder`::
The fragments builder is responsible for formatting the fragments, which uses `<em>` and `</em>` markup by default (if `hl.tag.pre` and `hl.tag.post` are not defined).
+
Another pre-configured choice is `colored`, which is an example of how to use the fragments builder to insert HTML into the snippets for colored highlights if you choose. You can also implement your own if you'd like. You can select a fragments builder with this parameter, or modify an existing implementation in `solrconfig.xml` to be the default by adding `default="true"`.
`hl.boundaryScanner`::
See <<Using Boundary Scanners with the FastVector Highlighter>> below.
`hl.bs.*`::
See <<Using Boundary Scanners with the FastVector Highlighter>> below.
`hl.phraseLimit`::
The maximum number of phrases to analyze when searching for the highest-scoring phrase. The default is `5000`.
`hl.multiValuedSeparatorChar`::
Text to use to separate one value from the next for a multi-valued field. The default is " " (a space).
[[Highlighting-UsingBoundaryScannerswiththeFastVectorHighlighter]]
=== Using Boundary Scanners with the FastVector Highlighter
@@ -178,16 +178,14 @@ The `ComplexPhraseQParser` provides support for wildcards, ORs, etc., inside phr
Under the covers, this query parser makes use of the Span group of queries, e.g., spanNear, spanOr, etc., and is subject to the same limitations as that family of parsers.
*Parameters*
`inOrder`::
Set to `true` to force phrase queries to match terms in the order specified. The default is `true`.
`df`::
The default search field.
*Examples*
[source,text]
----
@@ -284,20 +282,21 @@ Example:
The `FunctionRangeQParser` extends the `QParserPlugin` and creates a range query over a function. This is also referred to as `frange`, as seen in the examples below.
*Parameters*
`l`::
The lower bound. This parameter is optional.
`u`::
The upper bound. This parameter is optional.
`incl`::
Include the lower bound. This parameter is optional. The default is `true`.
`incu`::
Include the upper bound. This parameter is optional. The default is `true`.
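The bounds test these four parameters describe can be sketched as follows (illustrative Python, not Solr internals): a function value matches when it falls within the optionally open-ended, optionally exclusive range.

[source,python]
----
def frange_match(value, l=None, u=None, incl=True, incu=True) -> bool:
    """True if value falls within the bounds; omitted bounds are unbounded."""
    if l is not None and (value < l or (value == l and not incl)):
        return False
    if u is not None and (value > u or (value == u and not incu)):
        return False
    return True
----

For example, with `l=0` and `u=10`, a value of 10 matches by default but not when `incu=false`.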
*Examples*
[source,text]
----
@@ -318,24 +317,30 @@ For more information about range queries over functions, see Yonik Seeley's intr
The `graph` query parser does a breadth first, cyclic aware, graph traversal of all documents that are "reachable" from a starting set of root documents identified by a wrapped query.
The graph is built according to linkages between documents based on the terms found in `from` and `to` fields that you specify as part of the query.
[[OtherParsers-Parameters]]
=== Parameters
`to`::
The field name of matching documents to inspect to identify outgoing edges for graph traversal. Defaults to `edge_ids`.
`from`::
The field name of candidate documents to inspect to identify incoming graph edges. Defaults to `node_id`.
`traversalFilter`::
An optional query that can be supplied to limit the scope of documents that are traversed.
`maxDepth`::
Integer specifying how deep the breadth first search of the graph should go beginning with the initial query. Defaults to `-1` (unlimited).
`returnRoot`::
Boolean to indicate if the documents that matched the original query (to define the starting points for graph) should be included in the final results. Defaults to `true`.
`returnOnlyLeaf`::
Boolean that indicates if the results of the query should be filtered so that only documents with no outgoing edges are returned. Defaults to `false`.
`useAutn`::
Boolean that indicates if Automatons should be compiled for each iteration of the breadth first search, which may be faster for some graphs. Defaults to `false`.
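The parameters above are ordinary local params wrapped around a root query; as a sketch (the helper name and field values are hypothetical), a `graph` query string can be rendered like this:

```python
def graph_query(wrapped, params=None):
    """Render a {!graph ...} query string. `params` is a dict of the
    local parameters described above (from, to, maxDepth, etc.);
    `wrapped` is the query that selects the root documents."""
    params = params or {}
    rendered = " ".join(f"{k}={v}" for k, v in params.items())
    prefix = "{!graph" + ((" " + rendered) if rendered else "") + "}"
    return prefix + wrapped

q = graph_query("id:doc_1", {"from": "in_edge", "to": "out_edge", "maxDepth": 2})
print(q)  # {!graph from=in_edge to=out_edge maxDepth=2}id:doc_1
```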
[[OtherParsers-Limitations.1]]
=== Limitations
@ -590,23 +595,34 @@ Example:
This query parser takes the following parameters:
`qf`::
Specifies the fields to use for similarity.
`mintf`::
Specifies the Minimum Term Frequency, the frequency below which terms will be ignored in the source document.
`mindf`::
Specifies the Minimum Document Frequency, the frequency at which words will be ignored when they do not occur in at least this many documents.
`maxdf`::
Specifies the Maximum Document Frequency, the frequency at which words will be ignored when they occur in more than this many documents.
`minwl`::
Sets the minimum word length below which words will be ignored.
`maxwl`::
Sets the maximum word length above which words will be ignored.
`maxqt`::
Sets the maximum number of query terms that will be included in any generated query.
`maxntp`::
Sets the maximum number of tokens to parse in each example document field that is not stored with TermVector support.
`boost`::
Specifies if the query will be boosted by the interesting term relevance. It can be either `true` or `false`.
*Examples*
Find documents like the document with id=1 and using the `name` field for similarity.
@ -663,17 +679,16 @@ The main query, for both of these parsers, is parsed straightforwardly from the
This parser accepts the following parameters:
`f`::
The field to use (required).
`func`::
Payload function: min, max, average (required).
`includeSpanScore`::
If `true`, multiplies the computed payload factor by the score of the original query. If `false`, the default, the computed payload factor is the score.
*Example*
[source,text]
----
@ -687,20 +702,17 @@ Example:
This parser accepts the following parameters:
`f`::
The field to use (required).
`payloads`::
A space-separated list of payloads that must match the query terms (required).
+
Each specified payload will be encoded using the encoder determined from the field type and encoded accordingly for matching.
+
`DelimitedPayloadTokenFilter` 'identity' encoded payloads also work here, as well as float and integer encoded ones.
*Example*
[source,text]
----
@ -761,7 +773,7 @@ q.operators::
Comma-separated list of names of parsing operators to enable. By default, all operations are enabled, and this parameter can be used to effectively disable specific operators as needed, by excluding them from the list. Passing an empty string with this parameter disables all operators.
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
+
[cols="15,20,50,15",options="header"]
|===
|Name |Operator |Description |Example query
@ -791,7 +803,8 @@ At the end of terms, specifies a fuzzy query.
q.op::
Defines the default operator to use if none is defined by the user. Allowed values are `AND` and `OR`. `OR` is used if none is specified.
qf::
A list of query fields and boosts to use when building the query.
df::
Defines the default field if none is defined in the Schema, or overrides the default field if it is already defined.
@ -901,23 +914,23 @@ If no analysis or transformation is desired for any type of field, see the <<Oth
[[OtherParsers-TermsQueryParser]]
== Terms Query Parser
`TermsQParser` functions similarly to the <<OtherParsers-TermQueryParser,Term Query Parser>> but takes in multiple values separated by commas and returns documents matching any of the specified values.
This can be useful for generating filter queries from the external human readable terms returned by the faceting or terms components, and may be more efficient in some cases than using the <<the-standard-query-parser.adoc#the-standard-query-parser,Standard Query Parser>> to generate a Boolean query since the default implementation `method` avoids scoring.
This query parser takes the following parameters:
`f`::
The field on which to search. This parameter is required.
`separator`::
Separator to use when parsing the input. If set to " " (a single blank space), will trim additional white space from the input terms. Defaults to a comma (`,`).
`method`::
The internal query-building implementation: `termsFilter`, `booleanQuery`, `automaton`, or `docValuesTermsFilter`. Defaults to `termsFilter`.
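A sketch of rendering a `terms` filter query from the parameters above (the helper name and field values are illustrative; only a non-default separator needs to be passed as a local param):

```python
def terms_filter(field, values, separator=","):
    """Render a {!terms ...} filter query for the parameters described
    above. Values are joined with the chosen separator; the separator
    local param is emitted only when it differs from the comma default."""
    local = f"f={field}"
    if separator != ",":
        local += f' separator="{separator}"'
    return "{!terms " + local + "}" + separator.join(str(v) for v in values)

fq = terms_filter("id", [101, 102, 103])
print(fq)  # {!terms f=id}101,102,103
```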
*Examples*
[source,text]
----


@ -43,15 +43,14 @@ Three rank queries are currently included in the Solr distribution. You can also
The `rerank` parser wraps a query specified by a local parameter, along with additional parameters indicating how many documents should be re-ranked, and how the final scores should be computed:
`reRankQuery`::
The query string for your complex ranking query - in most cases <<local-parameters-in-queries.adoc#local-parameters-in-queries,a variable>> will be used to refer to another request parameter. This parameter is required.
`reRankDocs`::
The number of top N documents from the original query that should be re-ranked. This number will be treated as a minimum, and may be increased internally automatically in order to rank enough documents to satisfy the query (i.e., start+rows). The default is `200`.
`reRankWeight`::
A multiplicative factor that will be applied to the score from the reRankQuery for each of the top matching documents, before that score is added to the original score. The default is `2.0`.
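The score combination described by `reRankWeight` can be sketched as simple arithmetic (the function name is hypothetical; the formula follows the parameter description above):

```python
def reranked_score(original_score, rerank_score, re_rank_weight=2.0):
    """Final score for a re-ranked document: the original query score
    plus reRankWeight times the score from the reRankQuery, per the
    reRankWeight description above."""
    return original_score + re_rank_weight * rerank_score

# With a weight of 3.0, a document scoring 1.0 originally and 2.0 on
# the rerank query ends up with 1.0 + 3.0 * 2.0 = 7.0.
print(reranked_score(1.0, 2.0, re_rank_weight=3.0))  # 7.0
```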
In the example below, the top 1000 documents matching the query "greetings" will be re-ranked using the query "(hi hello hey hiya)". The resulting scores for each of those 1000 documents will be 3 times their score from the "(hi hello hey hiya)" query, plus the score from the original "greetings" query:


@ -1,6 +1,7 @@
= Schema API
:page-shortname: schema-api
:page-permalink: schema-api.html
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
@ -51,20 +52,18 @@ The base address for the API is `\http://<host>:<port>/solr/<collection_name>`.
bin/solr -e cloud -noprompt
----
[[SchemaAPI-APIEntryPoints]]
== API Entry Points
* `/schema`: <<Retrieve the Entire Schema,retrieve>> the schema, or <<Modify the Schema,modify>> the schema to add, remove, or replace fields, dynamic fields, copy fields, or field types
* `/schema/fields`: <<List Fields,retrieve information>> about all defined fields or a specific named field
* `/schema/dynamicfields`: <<List Dynamic Fields,retrieve information>> about all dynamic field rules or a specific named dynamic rule
* `/schema/fieldtypes`: <<List Field Types,retrieve information>> about all field types or a specific field type
* `/schema/copyfields`: <<List Copy Fields,retrieve information>> about copy fields
* `/schema/name`: <<Show Schema Name,retrieve>> the schema name
* `/schema/version`: <<Show the Schema Version,retrieve>> the schema version
* `/schema/uniquekey`: <<List UniqueKey,retrieve>> the defined uniqueKey
* `/schema/similarity`: <<Show Global Similarity,retrieve>> the global similarity definition
[[SchemaAPI-ModifytheSchema]]
== Modify the Schema
`POST /_collection_/schema`
@ -89,7 +88,6 @@ In each case, the response will include the status and the time to process the r
When modifying the schema with the API, a core reload will automatically occur in order for the changes to be available immediately for documents indexed thereafter. Previously indexed documents will *not* be automatically handled - they *must* be re-indexed if they used schema elements that you changed.
[[SchemaAPI-AddaNewField]]
=== Add a New Field
The `add-field` command adds a new field definition to your schema. If a field with the same name exists, an error is thrown.
@ -108,7 +106,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-DeleteaField]]
=== Delete a Field
The `delete-field` command removes a field definition from your schema. If the field does not exist in the schema, or if the field is the source or destination of a copy field rule, an error is thrown.
@ -122,7 +119,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-ReplaceaField]]
=== Replace a Field
The `replace-field` command replaces a field's definition. Note that you must supply the full definition for a field - this command will *not* partially modify a field's definition. If the field does not exist in the schema, an error is thrown.
@ -141,7 +137,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-AddaDynamicFieldRule]]
=== Add a Dynamic Field Rule
The `add-dynamic-field` command adds a new dynamic field rule to your schema.
@ -160,7 +155,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-DeleteaDynamicFieldRule]]
=== Delete a Dynamic Field Rule
The `delete-dynamic-field` command deletes a dynamic field rule from your schema. If the dynamic field rule does not exist in the schema, or if the schema contains a copy field rule with a target or destination that matches only this dynamic field rule, an error is thrown.
@ -174,7 +168,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-ReplaceaDynamicFieldRule]]
=== Replace a Dynamic Field Rule
The `replace-dynamic-field` command replaces a dynamic field rule in your schema. Note that you must supply the full definition for a dynamic field rule - this command will *not* partially modify a dynamic field rule's definition. If the dynamic field rule does not exist in the schema, an error is thrown.
@ -193,7 +186,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-AddaNewFieldType]]
=== Add a New Field Type
The `add-field-type` command adds a new field type to your schema.
@ -240,7 +232,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-DeleteaFieldType]]
=== Delete a Field Type
The `delete-field-type` command removes a field type from your schema. If the field type does not exist in the schema, or if any field or dynamic field rule in the schema uses the field type, an error is thrown.
@ -254,7 +245,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-ReplaceaFieldType]]
=== Replace a Field Type
The `replace-field-type` command replaces a field type in your schema. Note that you must supply the full definition for a field type - this command will *not* partially modify a field type's definition. If the field type does not exist in the schema, an error is thrown.
@ -276,22 +266,21 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-AddaNewCopyFieldRule]]
=== Add a New Copy Field Rule
The `add-copy-field` command adds a new copy field rule to your schema.
The attributes supported by the command are the same as when creating copy field rules by manually editing the `schema.xml`, as below:
`source`::
The source field. This parameter is required.
`dest`::
A field or an array of fields to which the source field will be copied. This parameter is required.
`maxChars`::
The upper limit for the number of characters to be copied. The section <<copying-fields.adoc#copying-fields,Copying Fields>> has more details.
For example, to define a rule to copy the field "shelf" to the "location" and "catchall" fields, you would POST the following request:
@ -304,7 +293,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-DeleteaCopyFieldRule]]
=== Delete a Copy Field Rule
The `delete-copy-field` command deletes a copy field rule from your schema. If the copy field rule does not exist in the schema an error is thrown.
@ -320,7 +308,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-MultipleCommandsinaSinglePOST]]
=== Multiple Commands in a Single POST
It is possible to perform one or more add requests in a single command. The API is transactional and all commands in a single call either succeed or fail together.
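A minimal sketch of assembling such a combined payload before POSTing it (the field and copy-field values are illustrative, not from the guide; sending the request to a live Solr instance is left out):

```python
import json

# Two schema commands combined into one transactional request body,
# mirroring the "multiple commands in a single POST" behavior above.
# The field names and types here are hypothetical examples.
payload = {
    "add-field": {"name": "shelf", "type": "string", "stored": True},
    "add-copy-field": {"source": "shelf", "dest": ["location", "catchall"]},
}
body = json.dumps(payload)
print(body)
```

The resulting `body` string is what would be sent with `Content-type: application/json`, e.g. via `curl --data-binary`.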
@ -387,7 +374,6 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{
}' http://localhost:8983/solr/gettingstarted/schema
----
[[SchemaAPI-SchemaChangesamongReplicas]]
=== Schema Changes among Replicas
When running in SolrCloud mode, changes made to the schema on one node will propagate to all replicas in the collection.
@ -398,51 +384,39 @@ If agreement is not reached by all replicas in the specified time, then the requ
If you do not supply an `updateTimeoutSecs` parameter, the default behavior is for the receiving node to return immediately after persisting the updates to ZooKeeper. All other replicas will apply the updates asynchronously. Consequently, without supplying a timeout, your client application cannot be sure that all replicas have applied the changes.
[[SchemaAPI-RetrieveSchemaInformation]]
== Retrieve Schema Information
The following endpoints allow you to read how your schema has been defined. You can GET the entire schema, or only portions of it as needed.
To modify the schema, see the previous section <<Modify the Schema>>.
[[SchemaAPI-RetrievetheEntireSchema]]
=== Retrieve the Entire Schema
`GET /_collection_/schema`
==== Retrieve Schema Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters should be added to the API request after '?'.
`wt`::
Defines the format of the response. The options are `json`, `xml` or `schema.xml`. If not specified, JSON will be returned by default.
==== Retrieve Schema Response
The output will include all fields, field types, dynamic rules and copy field rules, in the format requested (JSON or XML). The schema name and version are also included.
==== Retrieve Schema Examples
Get the entire schema in JSON.
@ -589,52 +563,49 @@ curl http://localhost:8983/solr/gettingstarted/schema?wt=schema.xml
</schema>
----
[[SchemaAPI-ListFields]]
=== List Fields
`GET /_collection_/schema/fields`
`GET /_collection_/schema/fields/_fieldname_`
==== List Fields Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
`fieldname`::
The specific fieldname (if limiting the request to a single field).
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`fl`::
Comma- or space-separated list of one or more fields to return. If not specified, all fields will be returned by default.
`includeDynamic`::
If `true`, and if the `fl` query parameter is specified or the `fieldname` path parameter is used, matching dynamic fields are included in the response and identified with the `dynamicBase` property.
+
If neither the `fl` query parameter nor the `fieldname` path parameter is specified, the `includeDynamic` query parameter is ignored.
+
If `false`, the default, matching dynamic fields will not be returned.
`showDefaults`::
If `true`, all default field properties from each field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
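The query parameters above are ordinary URL parameters; as a sketch, a request URL can be built like this (the host, port, collection name, and field list are assumptions for illustration):

```python
from urllib.parse import urlencode

# List two named fields in JSON format, including matching dynamic
# fields, per the wt/fl/includeDynamic parameters described above.
base = "http://localhost:8983/solr/gettingstarted/schema/fields"
params = {"wt": "json", "fl": "id,name", "includeDynamic": "true"}
url = base + "?" + urlencode(params)
print(url)
# http://localhost:8983/solr/gettingstarted/schema/fields?wt=json&fl=id%2Cname&includeDynamic=true
```

Note that `urlencode` percent-encodes the comma in the `fl` list; Solr accepts either form.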
==== List Fields Response
The output will include each field and any defined configuration for each field. The defined configuration can vary for each field, but will minimally include the field `name`, the `type`, if it is `indexed` and if it is `stored`.
If `multiValued` is defined as either true or false (most likely true), that will also be shown. See the section <<defining-fields.adoc#defining-fields,Defining Fields>> for more information about each parameter.
==== List Fields Examples
Get a list of all fields.
@ -677,48 +648,37 @@ The sample output below has been truncated to only show a few fields.
}
----
[[SchemaAPI-ListDynamicFields]]
=== List Dynamic Fields
`GET /_collection_/schema/dynamicfields`
`GET /_collection_/schema/dynamicfields/_name_`
==== List Dynamic Field Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
`name`::
The name of the dynamic field rule (if limiting request to a single dynamic field rule).
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`showDefaults`::
If `true`, all default field properties from each dynamic field's field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
==== List Dynamic Field Response
The output will include each dynamic field rule and the defined configuration for each rule. The defined configuration can vary for each rule, but will minimally include the dynamic field `name`, the `type`, if it is `indexed` and if it is `stored`. See the section <<dynamic-fields.adoc#dynamic-fields,Dynamic Fields>> for more information about each parameter.
==== List Dynamic Field Examples
Get a list of all dynamic field declarations:
@ -771,49 +731,37 @@ The sample output below has been truncated.
}
----
[[SchemaAPI-ListFieldTypes]]
=== List Field Types
`GET /_collection_/schema/fieldtypes`
`GET /_collection_/schema/fieldtypes/_name_`
==== List Field Type Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
`name`::
The name of the field type (if limiting request to a single field type).
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`showDefaults`::
If `true`, all default field properties from each field type will be included in the response (e.g. `tokenized` for `solr.TextField`). If `false`, the default, only explicitly specified field properties will be included.
==== List Field Type Response
The output will include each field type and any defined configuration for the type. The defined configuration can vary for each type, but will minimally include the field type `name` and the `class`. If query or index analyzers, tokenizers, or filters are defined, those will also be shown with other defined parameters. See the section <<solr-field-types.adoc#solr-field-types,Solr Field Types>> for more information about how to configure various types of fields.
==== List Field Type Examples
Get a list of all field types.
@ -873,49 +821,37 @@ The sample output below has been truncated to show a few different field types f
}
----
[[SchemaAPI-ListCopyFields]]
=== List Copy Fields
`GET /_collection_/schema/copyfields`
==== List Copy Field Parameters
*Path Parameters*
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
`source.fl`::
Comma- or space-separated list of one or more copyField source fields to include in the response - copyField directives with all other source fields will be excluded from the response. If not specified, all copyField-s will be included in the response.
`dest.fl`::
Comma- or space-separated list of one or more copyField destination fields to include in the response. copyField directives with all other `dest` fields will be excluded. If not specified, all copyField-s will be included in the response.
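Putting the two filters together, a filtered request might be built like this (collection and field names are hypothetical):

```shell
# Restrict the listing to copyField rules with the given source and dest fields.
COLL=gettingstarted
URL="http://localhost:8983/solr/${COLL}/schema/copyfields?wt=json&source.fl=shelf,location&dest.fl=catchall"
echo "$URL"
# curl "$URL"   # run against a live Solr instance
```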
==== List Copy Field Response
The output will include the `source` and `dest` (destination) of each copy field rule defined in `schema.xml`. For more information about copying fields, see the section <<copying-fields.adoc#copying-fields,Copying Fields>>.
==== List Copy Field Examples
Get a list of all copyFields.
[source,bash]
----
curl http://localhost:8983/solr/gettingstarted/schema/copyfields?wt=json
----
[[SchemaAPI-ShowSchemaName]]
=== Show Schema Name
`GET /_collection_/schema/name`
==== Show Schema Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
==== Show Schema Response
The output will be simply the name given to the schema.
==== Show Schema Examples
Get the schema name.
[source,bash]
----
curl http://localhost:8983/solr/gettingstarted/schema/name?wt=json
----

In this sample, the name of the schema is `example`.
[[SchemaAPI-ShowtheSchemaVersion]]
=== Show the Schema Version
`GET /_collection_/schema/version`
==== Show Schema Version Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
==== Show Schema Version Response
The output will simply be the schema version in use.
==== Show Schema Version Example
Get the schema version.
[source,bash]
----
curl http://localhost:8983/solr/gettingstarted/schema/version?wt=json
----

In this sample, the version of the schema is `1.5`.
[[SchemaAPI-ListUniqueKey]]
=== List UniqueKey
`GET /_collection_/schema/uniquekey`
==== List UniqueKey Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
==== List UniqueKey Response
The output will include simply the field name that is defined as the uniqueKey for the index.
==== List UniqueKey Example
List the uniqueKey.
[source,bash]
----
curl http://localhost:8983/solr/gettingstarted/schema/uniquekey?wt=json
----

In this sample, the uniqueKey for the index is `id`.
[[SchemaAPI-ShowGlobalSimilarity]]
=== Show Global Similarity
`GET /_collection_/schema/similarity`
==== Show Global Similarity Parameters
*Path Parameters*
`collection`::
The collection (or core) name.
*Query Parameters*
The query parameters can be added to the API request after a '?'.
`wt`::
Defines the format of the response. The options are `json` or `xml`. If not specified, JSON will be returned by default.
==== Show Global Similarity Response
The output will include the class name of the global similarity defined (if any).
==== Show Global Similarity Example
Get the similarity implementation.
[source,bash]
----
curl http://localhost:8983/solr/gettingstarted/schema/similarity?wt=json
----

In this sample, the configured similarity class is `org.apache.solr.search.similarities.DefaultSimilarityFactory`.
[[SchemaAPI-ManageResourceData]]
== Manage Resource Data
The <<managed-resources.adoc#managed-resources,Managed Resources>> REST API provides a mechanism for any Solr plugin to expose resources that should support CRUD (Create, Read, Update, Delete) operations. Depending on what Field Types and Analyzers are configured in your Schema, additional `/schema/` REST API paths may exist. See the <<managed-resources.adoc#managed-resources,Managed Resources>> section for more information and examples.
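As a hedged sketch of the CRUD style these endpoints follow, the request below would add entries to a managed stopwords list; the `/schema/analysis/stopwords/english` path and the collection name are assumptions that only hold if such a managed resource is configured in your schema:

```shell
# Compose (but do not send) a PUT that adds two stopwords.
PAYLOAD='["funny","silly"]'
URL="http://localhost:8983/solr/gettingstarted/schema/analysis/stopwords/english"
echo "PUT $URL $PAYLOAD"
# Against a live Solr instance:
# curl -X PUT -H 'Content-type:application/json' --data-binary "$PAYLOAD" "$URL"
```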

= Solr Control Script Reference
:page-shortname: solr-control-script-reference
:page-permalink: solr-control-script-reference.html
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
You can find the script in the `bin/` directory of your Solr installation.
More examples of `bin/solr` in use are available throughout the Solr Reference Guide, but particularly in the sections <<running-solr.adoc#running-solr,Running Solr>> and <<getting-started-with-solrcloud.adoc#getting-started-with-solrcloud,Getting Started with SolrCloud>>.
== Starting and Stopping
=== Start and Restart
The `start` command starts Solr. The `restart` command allows you to restart Solr while it is already running or if it has been stopped already.
The `start` and `restart` commands have several options to allow you to run in SolrCloud mode, use an example configuration set, start with a hostname or port that is not the default, and point to a local ZooKeeper ensemble.
When using the `restart` command, you must pass all of the parameters you initially passed when you started Solr. Behind the scenes, a stop request is initiated, so Solr will be stopped before being started again. If no nodes are already running, restart will skip the step to stop and proceed to starting Solr.
==== Start Parameters
The `bin/solr` script provides many options to allow you to customize the server in common ways, such as changing the listening port. However, most of the defaults are adequate for most Solr installations, especially when just getting started.
`-a "<string>"`::
Start Solr with additional JVM parameters, such as those starting with -X. If you are passing JVM parameters that begin with "-D", you can omit the -a option.
+
*Example*:
+
[source,bash]
bin/solr start -a "-Xdebug -Xrunjdwp:transport=dt_socket, server=y,suspend=n,address=1044"
`-cloud`::
Start Solr in SolrCloud mode, which will also launch the embedded ZooKeeper instance included with Solr.
+
This option can be shortened to simply `-c`.
+
If you are already running a ZooKeeper ensemble that you want to use instead of the embedded (single-node) ZooKeeper, you should also pass the -z parameter.
+
For more details, see the section <<SolrCloud Mode>> below.
+
*Example*: `bin/solr start -c`
`-d <dir>`::
Define a server directory, defaults to `server` (as in, `$SOLR_HOME/server`). It is uncommon to override this option. When running multiple instances of Solr on the same host, it is more common to use the same server directory for each instance and use a unique Solr home directory using the -s option.
+
*Example*: `bin/solr start -d newServerDir`
`-e <name>`::
Start Solr with an example configuration. These examples are provided to help you get started faster with Solr generally, or just try a specific feature.
+
The available options are:
* cloud
* techproducts
* dih
* schemaless
+
See the section <<SolrControlScriptReference-RunningwithExampleConfigurations,Running with Example Configurations>> below for more details on the example configurations.
+
*Example*: `bin/solr start -e schemaless`
`-f`::
Start Solr in the foreground; you cannot use this option when running examples with the -e option.
+
*Example*: `bin/solr start -f`
`-h <hostname>`::
Start Solr with the defined hostname. If this is not specified, 'localhost' will be assumed.
+
*Example*: `bin/solr start -h search.mysolr.com`
`-m <memory>`::
Start Solr with the defined value as the min (-Xms) and max (-Xmx) heap size for the JVM.
+
*Example*: `bin/solr start -m 1g`
`-noprompt`::
Start Solr and suppress any prompts that may be seen with another option. This would have the side effect of accepting all defaults implicitly.
+
For example, when using the "cloud" example, an interactive session guides you through several options for your SolrCloud cluster. If you want to accept all of the defaults, you can simply add the -noprompt option to your request.
+
*Example*: `bin/solr start -e cloud -noprompt`
`-p <port>`::
Start Solr on the defined port. If this is not specified, '8983' will be used.
+
*Example*: `bin/solr start -p 8655`
`-s <dir>`::
Sets the solr.solr.home system property; Solr will create core directories under this directory. This allows you to run multiple Solr instances on the same host while reusing the same server directory set using the -d parameter. If set, the specified directory should contain a solr.xml file, unless solr.xml exists in ZooKeeper. The default value is `server/solr`.
+
This parameter is ignored when running examples (-e), as the solr.solr.home depends on which example is run.
+
*Example*: `bin/solr start -s newHome`
`-v`::
Be more verbose. This changes the logging level of log4j from `INFO` to `DEBUG`, having the same effect as if you edited `log4j.properties` accordingly.
+
*Example*: `bin/solr start -f -v`
`-q`::
Be more quiet. This changes the logging level of log4j from `INFO` to `WARN`, having the same effect as if you edited `log4j.properties` accordingly. This can be useful in a production setting where you want to limit logging to warnings and errors.
+
*Example*: `bin/solr start -f -q`
`-V`::
Start Solr with verbose messages from the start script.
+
*Example*: `bin/solr start -V`
`-z <zkHost>`::
Start Solr with the defined ZooKeeper connection string. This option is only used with the -c option, to start Solr in SolrCloud mode. If this option is not provided, Solr will start the embedded ZooKeeper instance and use that instance for SolrCloud operations.
+
*Example*: `bin/solr start -c -z server1:2181,server2:2181`
`-force`::
If attempting to start Solr as the root user, the script will exit with a warning that running Solr as "root" can cause problems. It is possible to override this warning with the -force parameter.
+
*Example*: `sudo bin/solr start -force`
To emphasize how the default settings work, take a moment to understand that the following commands are equivalent:

[source,bash]
----
bin/solr start
bin/solr start -h localhost -p 8983 -d server
----
It is not necessary to define all of the options when starting if the defaults are fine for your needs.
==== Setting Java System Properties
The `bin/solr` script will pass any additional parameters that begin with `-D` to the JVM, which allows you to set arbitrary Java system properties.
For example, to set the auto soft-commit frequency to 3 seconds, you can do:
`bin/solr start -Dsolr.autoSoftCommit.maxTime=3000`
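Several `-D` properties can be combined on one command line; for example (the lock-type property is shown only as a second illustration):

```shell
# Both properties are passed straight through to the JVM.
CMD='bin/solr start -Dsolr.autoSoftCommit.maxTime=3000 -Dsolr.lock.type=native'
echo "$CMD"
# Run the echoed command on a host with Solr installed.
```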
==== SolrCloud Mode
The `-c` and `-cloud` options are equivalent:
For more information about starting Solr in SolrCloud mode, see also the section <<getting-started-with-solrcloud.adoc#getting-started-with-solrcloud,Getting Started with SolrCloud>>.
The example configurations allow you to get started quickly with a configuration that mirrors what you hope to accomplish with Solr.
Each example launches Solr with a managed schema, which allows use of the <<schema-api.adoc#schema-api,Schema API>> to make schema edits, but does not allow manual editing of a Schema file.
If you would prefer to manually modify a `schema.xml` file directly, you can change this default as described in the section <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>>.
Unless otherwise noted in the descriptions below, the examples do not enable <<solrcloud.adoc#solrcloud,SolrCloud>> nor <<schemaless-mode.adoc#schemaless-mode,schemaless mode>>.
The following examples are provided:
* *cloud*: This example starts a 1-4 node SolrCloud cluster on a single machine. When chosen, an interactive session will start to guide you through options to select the initial configset to use, the number of nodes for your example cluster, the ports to use, and name of the collection to be created.
+
When using this example, you can choose from any of the available configsets found in `$SOLR_HOME/server/solr/configsets`.
* *techproducts*: This example starts Solr in standalone mode with a schema designed for the sample documents included in the `$SOLR_HOME/example/exampledocs` directory.
+
The configset used can be found in `$SOLR_HOME/server/solr/configsets/sample_techproducts_configs`.
* *dih*: This example starts Solr in standalone mode with the DataImportHandler (DIH) enabled and several example `dataconfig.xml` files pre-configured for different types of data supported with DIH (such as, database contents, email, RSS feeds, etc.).
+
The configset used is customized for DIH, and is found in `$SOLR_HOME/example/example-DIH/solr/conf`.
+
For more information about DIH, see the section <<uploading-structured-data-store-data-with-the-data-import-handler.adoc#uploading-structured-data-store-data-with-the-data-import-handler,Uploading Structured Data Store Data with the Data Import Handler>>.
* *schemaless*: This example starts Solr in standalone mode using a managed schema, as described in the section <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>>, and provides a very minimal pre-defined schema. Solr will run in <<schemaless-mode.adoc#schemaless-mode,Schemaless Mode>> with this configuration, where Solr will create fields in the schema on the fly and will guess field types used in incoming documents.
+
The configset used can be found in `$SOLR_HOME/server/solr/configsets/data_driven_schema_configs`.
[IMPORTANT]
====
The run in-foreground option (`-f`) is not compatible with the `-e` option since the script needs to perform additional tasks after starting the Solr server.
====
=== Stop
The `stop` command sends a STOP request to a running Solr node, which allows it to shut down gracefully. The command will wait up to 5 seconds for Solr to stop gracefully and then will forcefully kill the process (`kill -9`).
`bin/solr stop -help`
==== Stop Parameters
`-p <port>`::
Stop Solr running on the given port. If you are running more than one instance, or are running in SolrCloud mode, you either need to specify the ports in separate requests or use the -all option.
+
*Example*: `bin/solr stop -p 8983`
`-all`::
Stop all running Solr instances that have a valid PID.
+
*Example*: `bin/solr stop -all`
`-k <key>`::
Stop key used to protect from stopping Solr inadvertently; default is "solrrocks".
+
*Example*: `bin/solr stop -k solrrocks`
== System Information
=== Version
The `version` command simply returns the version of Solr currently installed and immediately exits.
[source,bash]
----
$ bin/solr version
X.Y.0
----
=== Status
The `status` command displays basic JSON-formatted information for any Solr nodes found running on the local system.
=== Healthcheck

The `healthcheck` command generates a JSON-formatted health report for a collection when running in SolrCloud mode.
`bin/solr healthcheck -help`
==== Healthcheck Parameters
`-c <collection>`::
Name of the collection to run a healthcheck against (required).
+
*Example*: `bin/solr healthcheck -c gettingstarted`
`-z <zkhost>`::
ZooKeeper connection string, defaults to `localhost:9983`. If you are running Solr on a port other than 8983, you will have to specify the ZooKeeper connection string. By default, this will be the Solr port + 1000.
+
*Example*: `bin/solr healthcheck -z localhost:2181`
Below is an example healthcheck request using a non-standard ZooKeeper connect string:

[source,bash]
----
bin/solr healthcheck -c gettingstarted -z localhost:2181
----
The `bin/solr` script can also help you create new collections (in SolrCloud mode) or cores (in standalone mode), or delete collections.
=== Create a Core or Collection
The `create` command detects the mode that Solr is running in (standalone or SolrCloud) and then creates a core or collection depending on the mode.
`bin/solr create -help`
==== Create Core or Collection Parameters
`-c <name>`::
Name of the core or collection to create (required).
+
*Example*: `bin/solr create -c mycollection`
`-d <confdir>`::
The configuration directory. This defaults to `data_driven_schema_configs`.
+
See the section <<Configuration Directories and SolrCloud>> below for more details about this option when running in SolrCloud mode.
+
*Example*: `bin/solr create -d basic_configs`
`-n <configName>`::
The configuration name. This defaults to the same name as the core or collection.
+
*Example*: `bin/solr create -n basic`
`-p <port>`::
Port of a local Solr instance to send the create command to; by default the script tries to detect the port by looking for running Solr instances.
+
This option is useful if you are running multiple standalone Solr instances on the same host, thus requiring you to be specific about which instance to create the core in.
+
*Example*: `bin/solr create -p 8983`
`-s <shards>` or `-shards`::
Number of shards to split a collection into, default is 1; only applies when Solr is running in SolrCloud mode.
+
*Example*: `bin/solr create -s 2`
`-rf <replicas>` or `-replicationFactor`::
Number of copies of each document in the collection. The default is 1 (no replication).
+
*Example*: `bin/solr create -rf 2`
`-force`::
If attempting to run create as "root" user, the script will exit with a warning that running Solr or actions against Solr as "root" can cause problems. It is possible to override this warning with the -force parameter.
+
*Example*: `bin/solr create -c foo -force`
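The options above can be combined; for example, a sketch of creating a hypothetical collection with two shards and two copies of each document (SolrCloud mode only):

```shell
# Collection name is illustrative.
CMD='bin/solr create -c mycoll -s 2 -rf 2'
echo "$CMD"
```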
==== Configuration Directories and SolrCloud
Before creating a collection in SolrCloud, the configuration directory used by the collection must be uploaded to ZooKeeper. The `create` command supports several use cases for how collections and configuration directories work. The main decision you need to make is whether a configuration directory in ZooKeeper should be shared across multiple collections.
Let's work through a few examples to illustrate how configuration directories work in SolrCloud.
First, if you don't provide the `-d` or `-n` options, then the default configuration (`$SOLR_HOME/server/solr/configsets/data_driven_schema_configs/conf`) is uploaded to ZooKeeper using the same name as the collection.
For example, the following command will result in the `data_driven_schema_configs` configuration being uploaded to `/configs/contacts` in ZooKeeper: `bin/solr create -c contacts`.
If you create another collection with `bin/solr create -c contacts2`, then another copy of the `data_driven_schema_configs` directory will be uploaded to ZooKeeper under `/configs/contacts2`.
Any changes you make to the configuration for the contacts collection will not affect the `contacts2` collection. Put simply, the default behavior creates a unique copy of the configuration directory for each collection you create.
You can override the name given to the configuration directory in ZooKeeper by using the `-n` option. For instance, the command `bin/solr create -c logs -d basic_configs -n basic` will upload the `server/solr/configsets/basic_configs/conf` directory to ZooKeeper as `/configs/basic`.
Notice that we used the `-d` option to specify a different configuration than the default. Solr provides several built-in configurations under `server/solr/configsets`. However, you can also provide the path to your own configuration directory using the `-d` option. For instance, the command `bin/solr create -c mycoll -d /tmp/myconfigs` will upload `/tmp/myconfigs` into ZooKeeper under `/configs/mycoll`.
To reiterate, the configuration directory is named after the collection unless you override it using the `-n` option.
Other collections can share the same configuration by specifying the name of the shared configuration using the `-n` option. For instance, the following command will create a new collection that shares the basic configuration created previously: `bin/solr create -c logs2 -n basic`.
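The shared-configuration workflow described above can be sketched end to end as a short shell session (the collection and configset names are illustrative):

[source,bash]
----
# Upload basic_configs to ZooKeeper as /configs/basic and create the first collection
bin/solr create -c logs -d basic_configs -n basic

# Create a second collection that reuses the same /configs/basic directory
bin/solr create -c logs2 -n basic
----

Because both collections point at `/configs/basic`, a configuration change uploaded there affects them both once they are reloaded.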
[[SolrControlScriptReference-Data-drivenSchemaandSharedConfigurations]]
==== Data-driven Schema and Shared Configurations
The `data_driven_schema_configs` schema can mutate as data is indexed. Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections.
[[SolrControlScriptReference-Delete]]
=== Delete Core or Collection
The `delete` command detects the mode that Solr is running in (standalone or SolrCloud) and then deletes the specified core (standalone) or collection (SolrCloud) as appropriate.
`bin/solr delete -help`
If running in SolrCloud mode, the delete command checks if the configuration directory used by the collection you are deleting is being used by other collections. If not, then the configuration directory is also deleted from ZooKeeper.
For example, if you created a collection with `bin/solr create -c contacts`, then the delete command `bin/solr delete -c contacts` will check to see if the `/configs/contacts` configuration directory is being used by any other collections. If not, then the `/configs/contacts` directory is removed from ZooKeeper.
==== Delete Core or Collection Parameters
`-c <name>`::
Name of the core / collection to delete (required).
+
*Example*: `bin/solr delete -c mycoll`
`-deleteConfig`::
Whether or not the configuration directory should also be deleted from ZooKeeper. The default is `true`.
+
If the configuration directory is being used by another collection, then it will not be deleted even if you pass `-deleteConfig` as `true`.
+
*Example*: `bin/solr delete -deleteConfig false`
`-p <port>`::
The port of a local Solr instance to send the delete command to. By default the script tries to detect the port by looking for running Solr instances.
+
This option is useful if you are running multiple standalone Solr instances on the same host, thus requiring you to be specific about which instance to delete the core from.
+
*Example*: `bin/solr delete -p 8983`
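Putting these options together, a sketch of deleting a core from a specific standalone instance while keeping its configuration would be:

[source,bash]
----
# Delete the core "mycoll" from the Solr instance listening on port 8983,
# leaving any configuration directory in ZooKeeper untouched
bin/solr delete -c mycoll -deleteConfig false -p 8983
----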
== Authentication
// TODO 6.6 check this whole section for accuracy
The `bin/solr` script allows enabling or disabling Basic Authentication, allowing you to configure authentication from the command line.
Currently, this script only enables Basic Authentication, and is only available when using SolrCloud mode.
NOTE: Solr should have been started at least once before issuing these commands.
Use the `zk upconfig` command to upload one of the pre-configured configuration set or a customized configuration set to ZooKeeper.
==== ZK Upload Parameters
All parameters below are required.
`-n <name>`::
Name of the configuration set in ZooKeeper. This command will upload the configuration set to the "configs" ZooKeeper node giving it the name specified.
+
You can see all uploaded configuration sets in the Admin UI via the Cloud screens. Choose Cloud \-> Tree \-> configs to see them.
+
If a pre-existing configuration set is specified, it will be overwritten in ZooKeeper.
+
*Example*: `-n myconfig`
`-d <configset dir>`::
The path of the configuration set to upload. It should have a `conf` directory immediately below it that in turn contains `solrconfig.xml`, etc.
+
If just a name is supplied, `$SOLR_HOME/server/solr/configsets` will be checked for this name. An absolute path may be supplied instead.
+
*Examples*:
* `-d directory_under_configsets`
* `-d /path/to/configset/source`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
An example of this command with these parameters is:
[source,bash]
bin/solr zk upconfig -z 111.222.333.444:2181 -n mynewconfig -d /path/to/configset
.Reload Collections When Changing Configurations
[WARNING]
This command does *not* automatically make changes effective! It simply uploads the configuration sets to ZooKeeper. To make a changed configuration active, reload any collection that uses it.
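For example, after uploading a modified configset, the affected collection can be reloaded through the Collections API (the host, port, and collection name below are placeholders):

[source,bash]
----
# Upload the changed configset, then reload the collection that uses it
bin/solr zk upconfig -z 111.222.333.444:2181 -n mynewconfig -d /path/to/configset
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"
----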
Use the `zk downconfig` command to download a configuration set from ZooKeeper to the local filesystem.
==== ZK Download Parameters
All parameters listed below are required.
`-n <name>`::
Name of config set in ZooKeeper to download. The Admin UI Cloud \-> Tree \-> configs node lists all available configuration sets.
+
*Example*: `-n myconfig`
`-d <configset dir>`::
The path to write the downloaded configuration set into. If just a name is supplied, `$SOLR_HOME/server/solr/configsets` will be the parent. An absolute path may be supplied as well.
+
In either case, _pre-existing configurations at the destination will be overwritten!_
+
*Examples*:
* `-d directory_under_configsets`
* `-d /path/to/configset/destination`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
An example of this command with the parameters is:
[source,bash]
bin/solr zk downconfig -z 111.222.333.444:2181 -n mynewconfig -d /path/to/configset
A "best practice" is to keep your configuration sets in some form of version control as the system-of-record. In that scenario, `downconfig` should rarely be used.
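As a sketch of that version-control workflow (assuming a local git repository holds the configsets):

[source,bash]
----
# Pull the current state of a configset out of ZooKeeper...
bin/solr zk downconfig -z 111.222.333.444:2181 -n myconfig -d /path/to/configset-repo/myconfig

# ...and record it in version control as the system-of-record
cd /path/to/configset-repo
git add myconfig
git commit -m "Snapshot myconfig from ZooKeeper"
----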
[[SolrControlScriptReference-CopybetweenLocalFilesandZooKeeperznodes]]
=== Copy between Local Files and ZooKeeper znodes
Use the `zk cp` command for transferring files and directories between ZooKeeper znodes and your local drive. This command will copy from the local drive to ZooKeeper, from ZooKeeper to the local drive, or from ZooKeeper to ZooKeeper.
==== ZK Copy Parameters
`-r`::
Optional. Do a recursive copy. The command will fail if `<src>` has children unless `-r` is specified.
+
*Example*: `-r`
`<src>`::
The file or path to copy from. If prefixed with `zk:`, the source is presumed to be ZooKeeper. If there is no prefix or the prefix is `file:`, the source is the local drive. At least one of `<src>` or `<dest>` must be prefixed by `zk:` or the command will fail.
+
*Examples*:
* `zk:/configs/myconfigs/solrconfig.xml`
* `file:/Users/apache/configs/src`
`<dest>`::
The file or path to copy to. If prefixed with `zk:` then the source is presumed to be ZooKeeper. If no prefix or the prefix is `file:` this is the local drive.
+
At least one of `<src>` or `<dest>` must be prefixed by `zk:` or the command will fail. If `<dest>` ends in a slash character it names a directory.
+
*Examples*:
* `zk:/configs/myconfigs/solrconfig.xml`
* `file:/Users/apache/configs/src`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
An example of this command with the parameters is:
Copy a single file from ZooKeeper to local.
`bin/solr zk cp zk:/configs/myconf/managed_schema /configs/myconf/managed_schema -z 111.222.333.444:2181`
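Recursive copies work the same way; for instance, pushing an entire local configuration directory up to ZooKeeper might look like this (the paths are illustrative):

[source,bash]
----
# Recursively copy a local directory into ZooKeeper;
# -r is required because the source has children
bin/solr zk cp -r file:/Users/apache/configs/src zk:/configs/myconfigs -z 111.222.333.444:2181
----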
[[SolrControlScriptReference-RemoveaznodefromZooKeeper]]
=== Remove a znode from ZooKeeper
Use the `zk rm` command to remove a znode (and optionally all child nodes) from ZooKeeper.
==== ZK Remove Parameters
`-r`::
Optional. Do a recursive removal. The command will fail if `<path>` has children unless `-r` is specified.
+
*Example*: `-r`
`<path>`::
The path to remove from ZooKeeper, either a parent or leaf node.
+
There are limited safety checks; you cannot remove the `/` or `/zookeeper` nodes.
+
The path is assumed to be a ZooKeeper node, no `zk:` prefix is necessary.
+
*Examples*:
* `/configs`
* `/configs/myconfigset`
* `/configs/myconfigset/solrconfig.xml`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
Examples of this command with the parameters are:
`bin/solr zk rm -r /configs`
`bin/solr zk rm /configs/myconfigset/schema.xml`
[[SolrControlScriptReference-MoveOneZooKeeperznodetoAnother_Rename_]]
=== Move One ZooKeeper znode to Another (Rename)
Use the `zk mv` command to move (rename) a ZooKeeper znode.
==== ZK Move Parameters
`<src>`::
The znode to rename. The `zk:` prefix is assumed.
+
*Example*: `/configs/oldconfigset`
`<dest>`::
The new name of the znode. The `zk:` prefix is assumed.
+
*Example*: `/configs/newconfigset`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
An example of this command is:
`bin/solr zk mv /configs/oldconfigset /configs/newconfigset`
[[SolrControlScriptReference-ListaZooKeeperznode_sChildren]]
=== List a ZooKeeper znode's Children
Use the `zk ls` command to see the children of a znode.
==== ZK List Parameters
`-r`::
Optional. Recursively list all descendants of a znode.
+
*Example*: `-r`
`<path>`::
The path on ZooKeeper to list.
+
*Example*: `/collections/mycollection`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
An example of this command with the parameters is:
`bin/solr zk ls -r /collections/mycollection -z 111.222.333.444:2181`
Use the `zk mkroot` command to create a znode. The primary use-case for this command is to support ZooKeeper's "chroot" concept. However, it can also be used to create arbitrary paths.
==== Create znode Parameters
`<path>`::
The path on ZooKeeper to create. Intermediate znodes will be created if necessary. A leading slash is assumed even if not specified.
+
*Example*: `/solr`
`-z <zkHost>`::
The ZooKeeper connection string. Unnecessary if ZK_HOST is defined in `solr.in.sh` or `solr.in.cmd`.
+
*Example*: `-z 123.321.23.43:2181`
Examples of this command: