SOLR-10574, SOLR-10272: Refguide documentation for _default configset

This commit is contained in:
Ishan Chattopadhyaya 2017-07-06 17:54:02 +05:30
parent 88b7ed1d46
commit 112bdda47e
9 changed files with 24 additions and 25 deletions

@ -103,7 +103,7 @@ If you are on Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in the
[source,bash] [source,bash]
---- ----
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/basic_configs/conf ./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/_default/conf
---- ----
[[CommandLineUtilities-BootstrapZooKeeperfromexistingSOLR_HOME]] [[CommandLineUtilities-BootstrapZooKeeperfromexistingSOLR_HOME]]

@ -255,7 +255,7 @@ NOTE: If your operating system does not include cURL, you can download binaries
=== Create a SolrCloud Collection using bin/solr === Create a SolrCloud Collection using bin/solr
Create a 2-shard, replicationFactor=1 collection named mycollection using the default configset (data_driven_schema_configs): Create a 2-shard, replicationFactor=1 collection named mycollection using the default configset (_default):
.*nix command .*nix command
[source,bash] [source,bash]

@ -89,7 +89,9 @@ Next, the script prompts you for the number of shards to distribute the collecti
Next, the script will prompt you for the number of replicas to create for each shard. <<shards-and-indexing-data-in-solrcloud.adoc#shards-and-indexing-data-in-solrcloud,Replication>> is covered in more detail later in the guide, so if you're unsure, then use the default of 2 so that you can see how replication is handled in SolrCloud. Next, the script will prompt you for the number of replicas to create for each shard. <<shards-and-indexing-data-in-solrcloud.adoc#shards-and-indexing-data-in-solrcloud,Replication>> is covered in more detail later in the guide, so if you're unsure, then use the default of 2 so that you can see how replication is handled in SolrCloud.
Lastly, the script will prompt you for the name of a configuration directory for your collection. You can choose *basic_configs*, *data_driven_schema_configs*, or *sample_techproducts_configs*. The configuration directories are pulled from `server/solr/configsets/` so you can review them beforehand if you wish. The *data_driven_schema_configs* configuration (the default) is useful when you're still designing a schema for your documents and need some flexibility as you experiment with Solr. Lastly, the script will prompt you for the name of a configuration directory for your collection. You can choose *_default* or *sample_techproducts_configs*. The configuration directories are pulled from `server/solr/configsets/` so you can review them beforehand if you wish. The *_default* configuration is useful when you're still designing a schema for your documents and need some flexibility as you experiment with Solr, since it has schemaless functionality. However, after creating your collection, the schemaless functionality can be disabled in order to lock down the schema (so that documents indexed after doing so will not alter the schema) or to configure the schema by yourself. This can be done as follows (assuming your collection name is `mycollection`):
`curl http://host:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'`
At this point, you should have a new collection created in your local SolrCloud cluster. To verify this, you can run the status command: At this point, you should have a new collection created in your local SolrCloud cluster. To verify this, you can run the status command:

@ -29,7 +29,7 @@ For example, if you want several of your search handlers to return the same list
The properties and configuration of an `<initParams>` section mirror the properties and configuration of a request handler. It can include sections for defaults, appends, and invariants, the same as any request handler. The properties and configuration of an `<initParams>` section mirror the properties and configuration of a request handler. It can include sections for defaults, appends, and invariants, the same as any request handler.
For example, here is one of the `<initParams>` sections defined by default in the `data_driven_config` example: For example, here is one of the `<initParams>` sections defined by default in the `_default` example:
[source,xml] [source,xml]
---- ----

@ -29,7 +29,7 @@ These Solr features, all controlled via `solrconfig.xml`, are:
[[SchemalessMode-UsingtheSchemalessExample]] [[SchemalessMode-UsingtheSchemalessExample]]
== Using the Schemaless Example == Using the Schemaless Example
The three features of schemaless mode are pre-configured in the `data_driven_schema_configs` <<config-sets.adoc#config-sets,config set>> in the Solr distribution. To start an example instance of Solr using these configs, run the following command: The three features of schemaless mode are pre-configured in the `_default` <<config-sets.adoc#config-sets,config set>> in the Solr distribution. To start an example instance of Solr using these configs, run the following command:
[source,bash] [source,bash]
---- ----
@ -67,15 +67,10 @@ You can use the `/schema/fields` <<schema-api.adoc#schema-api,Schema API>> to co
"uniqueKey":true}]} "uniqueKey":true}]}
---- ----
[TIP]
====
The `data_driven_schema_configs` configset includes a `copyField` directive that causes all content to be indexed in a predefined "catch-all" `\_text_` field, which is used to enable single-field search that includes all fields' content. This will cause the index to be larger than it would be without this "catch-all" `copyField`. When you nail down your schema, consider removing the `\_text_` field and the corresponding `copyField` directive if you don't need it.
====
[[SchemalessMode-ConfiguringSchemalessMode]] [[SchemalessMode-ConfiguringSchemalessMode]]
== Configuring Schemaless Mode == Configuring Schemaless Mode
As described above, there are three configuration elements that need to be in place to use Solr in schemaless mode. In the `data_driven_schema_configs` config set included with Solr these are already configured. If, however, you would like to implement schemaless on your own, you should make the following changes. As described above, there are three configuration elements that need to be in place to use Solr in schemaless mode. In the default (`_default`) config set included with Solr these are already configured. If, however, you would like to implement schemaless on your own, you should make the following changes.
[[SchemalessMode-EnableManagedSchema]] [[SchemalessMode-EnableManagedSchema]]
=== Enable Managed Schema === Enable Managed Schema
@ -190,7 +185,7 @@ After each of these changes have been made, Solr should be restarted (or, you ca
[[SchemalessMode-ExamplesofIndexedDocuments]] [[SchemalessMode-ExamplesofIndexedDocuments]]
== Examples of Indexed Documents == Examples of Indexed Documents
Once the schemaless mode has been enabled (whether you configured it manually or are using `data_driven_schema_configs` ), documents that include fields that are not defined in your schema should be added to the index, and the new fields added to the schema. Once the schemaless mode has been enabled (whether you configured it manually or are using `_default` ), documents that include fields that are not defined in your schema should be added to the index, and the new fields added to the schema.
For example, adding a CSV document will cause its fields that are not in the schema to be added, with fieldTypes based on values: For example, adding a CSV document will cause its fields that are not in the schema to be added, with fieldTypes based on values:

@ -213,7 +213,7 @@ The configset used is customized for DIH, and is found in `$SOLR_HOME/example/ex
For more information about DIH, see the section <<uploading-structured-data-store-data-with-the-data-import-handler.adoc#uploading-structured-data-store-data-with-the-data-import-handler,Uploading Structured Data Store Data with the Data Import Handler>>. For more information about DIH, see the section <<uploading-structured-data-store-data-with-the-data-import-handler.adoc#uploading-structured-data-store-data-with-the-data-import-handler,Uploading Structured Data Store Data with the Data Import Handler>>.
* *schemaless*: This example starts Solr in standalone mode using a managed schema, as described in the section <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>>, and provides a very minimal pre-defined schema. Solr will run in <<schemaless-mode.adoc#schemaless-mode,Schemaless Mode>> with this configuration, where Solr will create fields in the schema on the fly and will guess field types used in incoming documents. * *schemaless*: This example starts Solr in standalone mode using a managed schema, as described in the section <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>>, and provides a very minimal pre-defined schema. Solr will run in <<schemaless-mode.adoc#schemaless-mode,Schemaless Mode>> with this configuration, where Solr will create fields in the schema on the fly and will guess field types used in incoming documents.
+ +
The configset used can be found in `$SOLR_HOME/server/solr/configsets/data_driven_schema_configs`. The configset used can be found in `$SOLR_HOME/server/solr/configsets/_default`.
[IMPORTANT] [IMPORTANT]
==== ====
@ -392,11 +392,11 @@ Name of the core or collection to create (required).
*Example*: `bin/solr create -c mycollection` *Example*: `bin/solr create -c mycollection`
`-d <confdir>`:: `-d <confdir>`::
The configuration directory. This defaults to `data_driven_schema_configs`. The configuration directory. This defaults to `_default`.
+ +
See the section <<Configuration Directories and SolrCloud>> below for more details about this option when running in SolrCloud mode. See the section <<Configuration Directories and SolrCloud>> below for more details about this option when running in SolrCloud mode.
+ +
*Example*: `bin/solr create -d basic_configs` *Example*: `bin/solr create -d _default`
`-n <configName>`:: `-n <configName>`::
The configuration name. This defaults to the same name as the core or collection. The configuration name. This defaults to the same name as the core or collection.
@ -431,15 +431,15 @@ Before creating a collection in SolrCloud, the configuration directory used by t
Let's work through a few examples to illustrate how configuration directories work in SolrCloud. Let's work through a few examples to illustrate how configuration directories work in SolrCloud.
First, if you don't provide the `-d` or `-n` options, then the default configuration (`$SOLR_HOME/server/solr/configsets/data_driven_schema_configs/conf`) is uploaded to ZooKeeper using the same name as the collection. First, if you don't provide the `-d` or `-n` options, then the default configuration (`$SOLR_HOME/server/solr/configsets/_default/conf`) is uploaded to ZooKeeper using the same name as the collection.
For example, the following command will result in the `data_driven_schema_configs` configuration being uploaded to `/configs/contacts` in ZooKeeper: `bin/solr create -c contacts`. For example, the following command will result in the `_default` configuration being uploaded to `/configs/contacts` in ZooKeeper: `bin/solr create -c contacts`.
If you create another collection with `bin/solr create -c contacts2`, then another copy of the `data_driven_schema_configs` directory will be uploaded to ZooKeeper under `/configs/contacts2`. If you create another collection with `bin/solr create -c contacts2`, then another copy of the `_default` directory will be uploaded to ZooKeeper under `/configs/contacts2`.
Any changes you make to the configuration for the contacts collection will not affect the `contacts2` collection. Put simply, the default behavior creates a unique copy of the configuration directory for each collection you create. Any changes you make to the configuration for the contacts collection will not affect the `contacts2` collection. Put simply, the default behavior creates a unique copy of the configuration directory for each collection you create.
You can override the name given to the configuration directory in ZooKeeper by using the `-n` option. For instance, the command `bin/solr create -c logs -d basic_configs -n basic` will upload the `server/solr/configsets/basic_configs/conf` directory to ZooKeeper as `/configs/basic`. You can override the name given to the configuration directory in ZooKeeper by using the `-n` option. For instance, the command `bin/solr create -c logs -d _default -n basic` will upload the `server/solr/configsets/_default/conf` directory to ZooKeeper as `/configs/basic`.
Notice that we used the `-d` option to specify a different configuration than the default. Solr provides several built-in configurations under `server/solr/configsets`. However you can also provide the path to your own configuration directory using the `-d` option. For instance, the command `bin/solr create -c mycoll -d /tmp/myconfigs`, will upload `/tmp/myconfigs` into ZooKeeper under `/configs/mycoll` . Notice that we used the `-d` option to specify a different configuration than the default. Solr provides several built-in configurations under `server/solr/configsets`. However you can also provide the path to your own configuration directory using the `-d` option. For instance, the command `bin/solr create -c mycoll -d /tmp/myconfigs`, will upload `/tmp/myconfigs` into ZooKeeper under `/configs/mycoll` .
@ -449,7 +449,9 @@ Other collections can share the same configuration by specifying the name of the
==== Data-driven Schema and Shared Configurations ==== Data-driven Schema and Shared Configurations
The `data_driven_schema_configs` schema can mutate as data is indexed. Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. The `_default` schema can mutate as data is indexed, since it has schemaless functionality (i.e. data-driven changes to the schema). Consequently, we recommend that you do not share data-driven configurations between collections unless you are certain that all collections should inherit the changes made when indexing data into one of the collections. You can turn off schemaless functionality (i.e. data-driven changes to the schema) for a collection by the following (assuming the collection name is `mycollection`):
`curl http://host:8983/solr/mycollection/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'`
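The Config API call above can be wrapped in a small script when several collections need the same setting. The following is only a sketch: the helper names `build_payload` and `config_command` are hypothetical, `host` is the same placeholder used in the command above, and the script prints the command instead of contacting Solr so that it can be reviewed first.

```shell
#!/bin/sh
# Placeholder base URL, copied from the command above; adjust for your install.
SOLR_URL="http://host:8983/solr"

# Hypothetical helper: build the JSON body that sets the
# update.autoCreateFields user property. $1 is "true" or "false".
build_payload() {
  printf '{"set-user-property": {"update.autoCreateFields":"%s"}}' "$1"
}

# Hypothetical helper: print (not run) the curl command for a collection.
# $1 is the collection name, $2 is "true" or "false".
config_command() {
  printf "curl %s/%s/config -d '%s'\n" "$SOLR_URL" "$1" "$(build_payload "$2")"
}

# Show the command that would disable schemaless field creation.
config_command mycollection false
```

Passing `true` instead of `false` produces the command that would turn data-driven field creation back on for that collection.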
=== Delete Core or Collection === Delete Core or Collection

@ -172,7 +172,7 @@ Here is the order in which the Solr Cell framework, using the Extracting Request
[[UploadingDatawithSolrCellusingApacheTika-ConfiguringtheSolrExtractingRequestHandler]] [[UploadingDatawithSolrCellusingApacheTika-ConfiguringtheSolrExtractingRequestHandler]]
== Configuring the Solr ExtractingRequestHandler == Configuring the Solr ExtractingRequestHandler
If you are not working with the supplied `sample_techproducts_configs `or` data_driven_schema_configs` <<config-sets.adoc#config-sets,config set>>, you must configure your own `solrconfig.xml` to know about the JARs containing the `ExtractingRequestHandler` and its dependencies: If you are not working with the supplied `sample_techproducts_configs` or `_default` <<config-sets.adoc#config-sets,config set>>, you must configure your own `solrconfig.xml` to know about the JARs containing the `ExtractingRequestHandler` and its dependencies:
[source,xml] [source,xml]
---- ----

@ -31,7 +31,7 @@ These files are uploaded in either of the following cases:
When you try SolrCloud for the first time using the `bin/solr -e cloud`, the related configset gets uploaded to ZooKeeper automatically and is linked with the newly created collection. When you try SolrCloud for the first time using the `bin/solr -e cloud`, the related configset gets uploaded to ZooKeeper automatically and is linked with the newly created collection.
The below command would start SolrCloud with the default collection name (gettingstarted) and default configset (data_driven_schema_configs) uploaded and linked to it. The below command would start SolrCloud with the default collection name (gettingstarted) and default configset (_default) uploaded and linked to it.
[source,bash] [source,bash]
---- ----
@ -42,10 +42,10 @@ You can also explicitly upload a configuration directory when creating a collect
[source,bash] [source,bash]
---- ----
bin/solr create -c mycollection -d data_driven_schema_configs bin/solr create -c mycollection -d _default
---- ----
The create command will upload a copy of the `data_driven_schema_configs` configuration directory to ZooKeeper under `/configs/mycollection`. Refer to the <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script Reference>> page for more details about the create command for creating collections. The create command will upload a copy of the `_default` configuration directory to ZooKeeper under `/configs/mycollection`. Refer to the <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script Reference>> page for more details about the create command for creating collections.
Once a configuration directory has been uploaded to ZooKeeper, you can update them using the <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script>> Once a configuration directory has been uploaded to ZooKeeper, you can update them using the <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script>>

@ -18,7 +18,7 @@
// specific language governing permissions and limitations // specific language governing permissions and limitations
// under the License. // under the License.
The VelocityResponseWriter is an optional plugin available in the `contrib/velocity` directory. It powers the /browse user interfaces when using configurations such as "basic_configs", "techproducts", and "example/files". The VelocityResponseWriter is an optional plugin available in the `contrib/velocity` directory. It powers the /browse user interfaces when using configurations such as "_default", "techproducts", and "example/files".
Its JAR and dependencies must be added (via `<lib>` or solr/home lib inclusion), and must be registered in `solrconfig.xml` like this: Its JAR and dependencies must be added (via `<lib>` or solr/home lib inclusion), and must be registered in `solrconfig.xml` like this: