lucene/solr/solr-ref-guide/src/major-changes-from-solr-5-t...

90 lines
6.6 KiB
Plaintext

= Major Changes from Solr 5 to Solr 6
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
There are some major changes in Solr 6 to consider before starting to migrate your configurations and indexes.
There are many hundreds of changes, so a thorough review of the <<solr-upgrade-notes.adoc#solr-upgrade-notes,Solr Upgrade Notes>> section as well as the {solr-javadocs}/changes/Changes.html[CHANGES.txt] file in your Solr instance will help you plan your migration to Solr 6. This section attempts to highlight some of the major changes you should be aware of.
== Highlights of New Features in Solr 6
Some of the major improvements in Solr 6 include:
[[major-5-6-streaming]]
=== Streaming Expressions
Introduced in Solr 5, <<streaming-expressions.adoc#streaming-expressions,Streaming Expressions>> allow querying Solr and getting results as a stream of data, sorted and aggregated as requested.
Several new expression types have been added in Solr 6:
* Parallel expressions using a MapReduce-like shuffling for faster throughput of high-cardinality fields.
* Daemon expressions to support continuous push or pull streaming.
* Advanced parallel relational algebra like distributed joins, intersections, unions and complements.
* Publish/Subscribe messaging.
* JDBC connections to pull data from other systems and join with documents in the Solr index.
[[major-5-6-parallel-sql]]
=== Parallel SQL Interface
Built on streaming expressions, new in Solr 6 is a <<parallel-sql-interface.adoc#parallel-sql-interface,Parallel SQL interface>> to be able to send SQL queries to Solr. SQL statements are compiled to streaming expressions on the fly, providing the full range of aggregations available to streaming expression requests. A JDBC driver is included, which allows using SQL clients and database visualization tools to query your Solr index and import data to other systems.
=== Cross Data Center Replication
Replication across data centers is now possible with <<cross-data-center-replication-cdcr.adoc#cross-data-center-replication-cdcr,Cross Data Center Replication>>. Using an active-passive model, a SolrCloud cluster can be replicated to another data center, and monitored with a new API.
=== Graph QueryParser
A new <<other-parsers.adoc#graph-query-parser,`graph` query parser>> makes it possible to to graph traversal queries of Directed (Cyclic) Graphs modelled using Solr documents.
[[major-5-6-docvalues]]
=== DocValues
Most non-text field types in the Solr sample configsets now default to using <<docvalues.adoc#docvalues,DocValues>>.
== Java 8 Required
The minimum supported version of Java for Solr 6 (and the <<using-solrj.adoc#using-solrj,SolrJ client libraries>>) is now Java 8.
== Index Format Changes
Solr 6 has no support for reading Lucene/Solr 4.x and earlier indexes. Be sure to run the Lucene `IndexUpgrader` included with Solr 5.5 if you might still have old 4x formatted segments in your index. Alternatively: fully optimize your index with Solr 5.5 to make sure it consists only of one up-to-date index segment.
== Managed Schema is now the Default
Solr's default behavior when a `solrconfig.xml` does not explicitly define a `<schemaFactory/>` is now dependent on the `luceneMatchVersion` specified in that `solrconfig.xml`. When `luceneMatchVersion < 6.0`, `ClassicIndexSchemaFactory` will continue to be used for back compatibility, otherwise an instance of <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,`ManagedIndexSchemaFactory`>> will be used.
The most notable impacts of this change are:
* Existing `solrconfig.xml` files that are modified to use `luceneMatchVersion >= 6.0`, but do _not_ have an explicitly configured `ClassicIndexSchemaFactory`, will have their `schema.xml` file automatically upgraded to a `managed-schema` file.
* Schema modifications via the <<schema-api.adoc#schema-api,Schema API>> will now be enabled by default.
Please review the <<schema-factory-definition-in-solrconfig.adoc#schema-factory-definition-in-solrconfig,Schema Factory Definition in SolrConfig>> section for more details.
== Default Similarity Changes
Solr's default behavior when a Schema does not explicitly define a global <<other-schema-elements.adoc#other-schema-elements,`<similarity/>`>> is now dependent on the `luceneMatchVersion` specified in the `solrconfig.xml`. When `luceneMatchVersion < 6.0`, an instance of `ClassicSimilarityFactory` will be used, otherwise an instance of `SchemaSimilarityFactory` will be used. Most notably this change means that users can take advantage of per Field Type similarity declarations, without needing to also explicitly declare a global usage of `SchemaSimilarityFactory`.
Regardless of whether it is explicitly declared, or used as an implicit global default, `SchemaSimilarityFactory` 's implicit behavior when a Field Types do not declare an explicit `<similarity />` has also been changed to depend on the the `luceneMatchVersion`. When `luceneMatchVersion < 6.0`, an instance of `ClassicSimilarity` will be used, otherwise an instance of `BM25Similarity` will be used. A `defaultSimFromFieldType` init option may be specified on the `SchemaSimilarityFactory` declaration to change this behavior. Please review the `SchemaSimilarityFactory` javadocs for more details
== Replica & Shard Delete Command Changes
DELETESHARD and DELETEREPLICA now default to deleting the instance directory, data directory, and index directory for any replica they delete. Please review the <<collections-api.adoc#collections-api,Collection API>> documentation for details on new request parameters to prevent this behavior if you wish to keep all data on disk when using these commands
== facet.date.* Parameters Removed
The `facet.date` parameter (and associated `facet.date.*` parameters) that were deprecated in Solr 3.x have been removed completely. If you have not yet switched to using the equivalent <<faceting.adoc#faceting,`facet.range`>> functionality you must do so now before upgrading.