Apache Solr Release Notes

Introduction
------------
Apache Solr is an open source enterprise search server based on the Apache Lucene Java
search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search,
caching, replication, and a web administration interface. It runs in a Java
servlet container such as Tomcat.

See http://lucene.apache.org/solr for more information.


Getting Started
---------------
You need a Java 1.7 VM or later installed.
In this release, there is an example Solr server including a bundled
servlet container in the directory named "example".
See the tutorial at http://lucene.apache.org/solr/tutorial.html


$Id$

================== 5.0.0 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.8.0
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Upgrading from Solr 4.x
|
||
----------------------
|
||
|
||
* The "file" attribute of infoStream in solrconfig.xml is removed. Control this
via your logging configuration (org.apache.solr.update.LoggingInfoStream) instead.

* UniqFieldsUpdateProcessorFactory no longer supports the <lst name="fields"> init
param style that was deprecated in Solr 4.5. If you are still using this syntax,
update your configs to use <arr name="fieldName"> instead. See SOLR-4249 for more
details.
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-4622: Hardcoded SolrCloud defaults for hostContext and hostPort that
|
||
were deprecated in 4.3 have been removed completely. (hossman)
|
||
|
||
* SOLR-4792: Stop shipping a .war. (Robert Muir)
|
||
|
||
================== 4.7.0 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.8.0
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Upgrading from Solr 4.6.0
|
||
----------------------
|
||
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
New Features
|
||
----------------------
|
||
|
||
* SOLR-5308: A new 'migrate' collection API to split all documents with a
|
||
route key into another collection (shalin)
|
||
|
||
* SOLR-5441: Expose number of transaction log files and their size via JMX.
|
||
(Rafał Kuć via shalin)
|
||
|
||
* SOLR-5320: Added support for tri-level compositeId routing.
|
||
(Anshum Gupta via shalin)
|
||
|
||
* SOLR-5287: You can edit files in the conf directory from the admin UI
|
||
(Erick Erickson, Stefan Matheis)
|
||
|
||
* SOLR-5447, SOLR-5490: Add a QParserPlugin for Lucene's SimpleQueryParser.
|
||
(Jack Conradson via shalin)
|
||
|
||
* SOLR-5446: Admin UI - Allow changing Schema and Config (steffkes)
|
||
|
||
* SOLR-5456: Admin UI - Allow creating new Files (steffkes)
|
||
|
||
* SOLR-5208: Support for the setting of core.properties key/values at create-time on
|
||
Collections API (Erick Erickson)
|
||
|
||
* SOLR-5428: New 'stats.calcdistinct' parameter in StatsComponent returns
|
||
set of distinct values and their count. This can also be specified per field
|
||
e.g. 'f.field.stats.calcdistinct'. (Elran Dvir via shalin)
|
||
|
||
* SOLR-5378: A new SuggestComponent that fully utilizes the Lucene suggester
|
||
module and adds pluggable dictionaries, payloads and better distributed support.
|
||
This is intended to eventually replace the Suggester support through the
|
||
SpellCheckComponent. (Areek Zillur, Varun Thacker via shalin)
|
||
|
||
* SOLR-5494: CoreContainer#remove throws NPE rather than returning null when
|
||
a SolrCore does not exist in core discovery mode. (Mark Miller)
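
As a rough illustration of the 'stats.calcdistinct' parameter from SOLR-5428 above,
the following Python sketch builds the query string for the global form and for the
per-field form. The field name "price" and the q/rows values are assumptions chosen
for illustration; 'stats' and 'stats.field' are the usual StatsComponent switches.

```python
from urllib.parse import urlencode


def stats_query(field, per_field=False):
    """Build a query string asking StatsComponent for distinct values.

    `field` is a hypothetical field name; nothing here contacts a server.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("stats", "true"),
        ("stats.field", field),
    ]
    if per_field:
        # Per-field form: f.<field>.stats.calcdistinct
        params.append((f"f.{field}.stats.calcdistinct", "true"))
    else:
        # Global form: applies to every stats.field in the request
        params.append(("stats.calcdistinct", "true"))
    return urlencode(params)


print(stats_query("price"))
print(stats_query("price", per_field=True))
```

The resulting string would be appended to a select URL on your own Solr host.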
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
|
||
* SOLR-5438: DebugComponent throws NPE when used with grouping.
|
||
(Tomás Fernández Löbbe via shalin)
|
||
|
||
* SOLR-5442: Python client cannot parse proxied response when served by Tomcat.
|
||
(Patrick Hunt, Gregory Chanan, Vamsee Yarlagadda, Romain Rigaux, Mark Miller)
|
||
|
||
* SOLR-5445: Proxied responses should propagate all headers rather than the
|
||
first one for each key. (Patrick Hunt, Mark Miller)
|
||
|
||
* SOLR-4612: Admin UI - Analysis Screen contains empty table-columns (steffkes)
|
||
|
||
* SOLR-5451: SyncStrategy closes its http connection manager before the
executor that uses it in its close method. (Mark Miller)
|
||
|
||
* SOLR-5460: SolrDispatchFilter#sendError can get a SolrCore that it does not
|
||
close. (Mark Miller)
|
||
|
||
* SOLR-5461: Request proxying should only set con.setDoOutput(true) if the
|
||
request is a post. (Mark Miller)
|
||
|
||
* SOLR-5479: SolrCmdDistributor retry logic stops if a leader for the request
|
||
cannot be found in 1 second. (Mark Miller)
|
||
|
||
* SOLR-5481: SolrCmdDistributor should not let the http client do its own
retries. (Mark Miller)
|
||
|
||
* SOLR-4709: The core reload after replication if config files have changed
can fail due to a race condition. (Mark Miller, Hossman)
|
||
|
||
* LUCENE-5347: Fixed Solr's ZooKeeper client to copy files to ZooKeeper using
binary transfer. Previously data was read with default encoding and stored
in ZooKeeper as UTF-8. This bug was found after upgrading to forbidden-apis
1.4. (Uwe Schindler)
|
||
|
||
* SOLR-4376: DataImportHandler uses wrong date format for last_index_time if
|
||
a delta-import is run first before any full-imports.
|
||
(Sebastien Lorber, Arcadius Ahouansou via shalin)
|
||
|
||
* SOLR-5496: We should share an http connection manager across non search
|
||
HttpClients and ensure all http connection managers get shutdown.
|
||
(Mark Miller)
|
||
|
||
Optimizations
|
||
----------------------
|
||
|
||
* SOLR-5458: Admin UI - Remove separated Pages for Config & Schema (steffkes)
|
||
|
||
* SOLR-5436: Eliminate the 1500ms wait in overseer loop as well as
|
||
polling the ZK distributed queue. (Noble Paul, Mark Miller)
|
||
|
||
Other Changes
|
||
---------------------
|
||
|
||
* SOLR-5399: Add distributed request tracking information to DebugComponent
|
||
(Tomás Fernández Löbbe via Ryan Ernst)
|
||
|
||
* SOLR-5421: Remove double set of distrib.from param in processAdd method of
|
||
DistributedUpdateProcessor. (Anshum Gupta via shalin)
|
||
|
||
* SOLR-5404: The example config references deprecated classes.
|
||
(Uwe Schindler, Rafał Kuć via Mark Miller)
|
||
|
||
* SOLR-5487: Replication factor error message doesn't match constraint.
|
||
(Patrick Hunt via shalin)
|
||
|
||
================== 4.6.0 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.8.0
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Upgrading from Solr 4.5.0
|
||
----------------------
|
||
|
||
* If you are using methods from FieldMutatingUpdateProcessorFactory for getting
|
||
configuration information (oneOrMany or getBooleanArg), those methods have
|
||
been moved to NamedList and renamed to removeConfigArgs and removeBooleanArg,
|
||
respectively. The original methods are deprecated, to be removed in 5.0.
|
||
See SOLR-5264.
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
New Features
|
||
----------------------
|
||
|
||
* SOLR-5167: Add support for AnalyzingInfixSuggester (AnalyzingInfixLookupFactory).
|
||
(Areek Zillur, Varun Thacker via Robert Muir)
|
||
|
||
* SOLR-5246: Shard splitting now supports collections configured with router.field.
|
||
(shalin)
|
||
|
||
* SOLR-5274: Allow JettySolrRunner SSL config to be specified via a constructor.
|
||
(Mark Miller)
|
||
|
||
* SOLR-5300: Shards can be split by specifying arbitrary number of hash ranges
|
||
within the shard's hash range. (shalin)
|
||
|
||
* SOLR-5226: Add Lucene index heap usage to the Solr admin UI.
|
||
(Areek Zillur via Robert Muir)
|
||
|
||
* SOLR-5324: Make sub shard replica recovery and shard state switch asynchronous.
|
||
(Yago Riveiro, shalin)
|
||
|
||
* SOLR-5338: Split shards by a route key using split.key parameter. (shalin)
|
||
|
||
* SOLR-5353: Enhance CoreAdmin api to split a route key's documents from an index
|
||
and leave behind all other documents. (shalin)
|
||
|
||
* SOLR-5027: CollapsingQParserPlugin for high performance field collapsing on high cardinality fields.
|
||
(Joel Bernstein)
|
||
|
||
* SOLR-5395: Added a RunAlways marker interface for UpdateRequestProcessorFactory
|
||
implementations indicating that they should not be removed in later stages
|
||
of distributed updates (usually signalled by the update.distrib parameter)
|
||
(yonik)
|
||
|
||
* SOLR-5310: Add a collection admin command to remove a replica (noble)
|
||
|
||
* SOLR-5311: Avoid registering replicas which are removed (noble)
|
||
|
||
* SOLR-5406: CloudSolrServer failed to propagate request parameters
|
||
along with delete updates. (yonik)
|
||
|
||
* SOLR-5374: Support user configured doc-centric versioning rules
|
||
via the optional DocBasedVersionConstraintsProcessorFactory
|
||
update processor (Hossman, yonik)
|
||
|
||
* SOLR-5392: Extend solrj apis to cover collection management.
|
||
(Roman Shaposhnik via Mark Miller)
|
||
|
||
* SOLR-5084: new field type EnumField. (Elran Dvir via Erick Erickson)
|
||
|
||
* SOLR-5464: Add option to ConcurrentSolrServer to stream pure delete
|
||
requests. (Mark Miller)
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
|
||
* SOLR-5216: Document updates to SolrCloud can cause a distributed deadlock.
|
||
(Mark Miller)
|
||
|
||
* SOLR-5367: Unmarshalling delete by id commands with JavaBin can lead to class cast
|
||
exception. (Mark Miller)
|
||
|
||
* SOLR-5359: ZooKeeper client is not closed when it fails to connect to an ensemble.
|
||
(Mark Miller, Klaus Herrmann)
|
||
|
||
* SOLR-5042: MoreLikeThisComponent was using the rows/count value in place of
|
||
flags, which caused a number of very strange issues, including NPEs and
|
||
ignoring requests for the results to include the score.
|
||
(Anshum Gupta, Mark Miller, Shawn Heisey)
|
||
|
||
* SOLR-5371: Solr should consistently call SolrServer#shutdown (Mark Miller)
|
||
|
||
* SOLR-5363: Solr doesn't start up properly with Log4J2 (Petar Tahchiev via Alan
|
||
Woodward)
|
||
|
||
* SOLR-5380: Using cloudSolrServer.setDefaultCollection(collectionId) does not
|
||
work as intended for an alias spanning more than 1 collection.
|
||
(Thomas Egense, Shawn Heisey, Mark Miller)
|
||
|
||
* SOLR-5418: Background merge after field removed from solr.xml causes error.
|
||
(Reported on user's list, Robert M's patch via Erick Erickson)
|
||
|
||
* SOLR-5318: Creating a core via the admin API doesn't respect transient property
|
||
(Olivier Soyez via Erick Erickson)
|
||
|
||
* SOLR-5388: Creating a new core via the HTTP API that results in a transient being
|
||
unloaded results in a " Too many close [count:-1]" error.
|
||
(Olivier Soyez via Erick Erickson)
|
||
|
||
* SOLR-5453: Raise recovery socket read timeouts. (Mark Miller)
|
||
|
||
* SOLR-5397: Replication can fail silently in some cases. (Mark Miller)
|
||
|
||
* SOLR-5465: SolrCmdDistributor retry logic has a concurrency race bug.
|
||
(Mark Miller)
|
||
|
||
* SOLR-5452: Do not attempt to proxy internal update requests. (Mark Miller)
|
||
|
||
Optimizations
|
||
----------------------
|
||
|
||
* SOLR-5232: SolrCloud should distribute updates via streaming rather than buffering.
|
||
(Mark Miller)
|
||
|
||
* SOLR-5223: SolrCloud should use the JavaBin binary format for communication by default.
|
||
(Mark Miller)
|
||
|
||
* SOLR-5370: Requests to recover when an update fails should be done in
|
||
background threads. (Mark Miller)
|
||
|
||
* LUCENE-5300,LUCENE-5304: Specialized faceting for fields which are declared as
|
||
multi-valued in the schema but are actually single-valued. (Adrien Grand)
|
||
|
||
Security
|
||
----------------------
|
||
|
||
* SOLR-4882: SolrResourceLoader was restricted to only allow access to resource
files below the instance dir. The reason for this is security related: some
Solr components allow resource paths to be passed in via REST parameters
(e.g. XSL stylesheets, velocity templates, ...) and load them via the resource
loader. For backwards compatibility, this security feature can be disabled
by a new system property: solr.allow.unsafe.resourceloading=true
(Uwe Schindler)
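
The restriction described in SOLR-4882 amounts to a path containment check. The
Python sketch below shows the general idea only; it is not SolrResourceLoader's
implementation, and the function names, the allow_unsafe flag and the sample paths
are hypothetical.

```python
from pathlib import Path


def is_below_instance_dir(instance_dir, resource):
    """Return True if `resource` resolves to a location inside `instance_dir`."""
    base = Path(instance_dir).resolve()
    # Joining an absolute `resource` replaces `base` entirely, so absolute
    # paths outside the instance dir are rejected by the check below.
    target = (base / resource).resolve()
    return target == base or base in target.parents


def may_load(instance_dir, resource, allow_unsafe=False):
    # `allow_unsafe` stands in for solr.allow.unsafe.resourceloading=true
    # (a hypothetical mapping, for illustration only).
    return allow_unsafe or is_below_instance_dir(instance_dir, resource)


print(may_load("/tmp/solr/collection1", "conf/xslt/example.xsl"))
print(may_load("/tmp/solr/collection1", "../../etc/passwd"))
```

Resolving both paths before comparing them is what defeats "../" sequences passed
in through a request parameter.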
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-5237: Add indexHeapUsageBytes to LukeRequestHandler, indicating how much
|
||
heap memory is being used by the underlying Lucene index structures.
|
||
(Areek Zillur via Robert Muir)
|
||
|
||
* SOLR-5241: Fix SimplePostToolTest performance problem - implicit DNS lookups
|
||
(hossman)
|
||
|
||
* SOLR-5273: Update HttpComponents to 4.2.5 and 4.2.6. (Mark Miller)
|
||
|
||
* SOLR-5264: Move methods for getting config information from
|
||
FieldMutatingUpdateProcessorFactory to NamedList. (Shawn Heisey)
|
||
|
||
* SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes.
|
||
(Jessica Cheng via shalin)
|
||
|
||
* SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to
|
||
use router name from message where none is ever sent. (shalin)
|
||
|
||
* SOLR-5401: SolrResourceLoader logs a warning if a deprecated (factory) class
|
||
is used in schema or config. (Uwe Schindler)
|
||
|
||
* SOLR-3397: Warn if master or slave replication is enabled in SolrCloud mode. (Erick
|
||
Erickson)
|
||
|
||
================== 4.5.1 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.8.0
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
|
||
* SOLR-4590: Collections API should return a nice error when not in SolrCloud mode.
|
||
(Anshum Gupta, Mark Miller)
|
||
|
||
* SOLR-5295: The CREATESHARD collection API creates maxShardsPerNode number of
|
||
replicas if replicationFactor is not specified. (Brett Hoerner, shalin)
|
||
|
||
* SOLR-5296: Creating a collection with implicit router adds shard ranges
|
||
to each shard. (shalin)
|
||
|
||
* SOLR-5263: Fix CloudSolrServer URL cache update race. (Jessica Cheng, Mark Miller)
|
||
|
||
* SOLR-5297: Admin UI - Threads Screen missing Icon (steffkes)
|
||
|
||
* SOLR-5301: DELETEALIAS command prints CREATEALIAS in logs (janhoy)
|
||
|
||
* SOLR-5255: Remove unnecessary call to fetch and watch live nodes in ZkStateReader
|
||
cluster watcher. (Jessica Cheng via shalin)
|
||
|
||
* SOLR-5305: Admin UI - Reloading System-Information on Dashboard does not work
|
||
anymore (steffkes)
|
||
|
||
* SOLR-5314: Shard split action should use soft commits instead of hard commits
|
||
to make sub shard data visible. (Kalle Aaltonen, shalin)
|
||
|
||
* SOLR-5327: SOLR-4915, "The root cause should be returned to the user when a SolrCore create
|
||
call fails", was reverted. (Mark Miller)
|
||
|
||
* SOLR-5317: SolrCore persistence bugs if defining SolrCores in solr.xml.
|
||
(Mark Miller, Yago Riveiro)
|
||
|
||
* SOLR-5306: Extra collection creation parameters like collection.configName are
|
||
not being respected. (Mark Miller, Liang Tianyu, Nathan Neulinger)
|
||
|
||
* SOLR-5325: ZooKeeper connection loss can cause the Overseer to stop processing
|
||
commands. (Christine Poerschke, Mark Miller, Jessica Cheng)
|
||
|
||
* SOLR-4327: HttpSolrServer can leak connections on errors. (Karl Wright, Mark Miller)
|
||
|
||
* SOLR-5349: CloudSolrServer - ZK timeout arguments passed to ZkStateReader are flipped.
|
||
(Ricardo Merizalde via shalin)
|
||
|
||
* SOLR-5330: facet.method=fcs on single-valued fields could sometimes result
in incorrect facet labels. (Michael Froh, yonik)
|
||
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-5323: Disable ClusteringComponent by default in collection1 example.
|
||
The solr.clustering.enabled system property needs to be set to 'true'
|
||
to enable the clustering contrib (reverts SOLR-4708). (Dawid Weiss)
|
||
|
||
================== 4.5.0 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.8.0
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Upgrading from Solr 4.4.0
|
||
----------------------
|
||
|
||
* XML configuration parsing is now more strict about situations where a single
|
||
setting is allowed but multiple values are found. In the past, one value
|
||
would be chosen arbitrarily and silently. Starting with 4.5, configuration
|
||
parsing will fail with an error in situations like this. If you see error
|
||
messages such as "solrconfig.xml contains more than one value for config path:
|
||
XXXXX" or "Found Z configuration sections when at most 1 is allowed matching
|
||
expression: XXXXX" check your solrconfig.xml file for multiple occurrences of
|
||
XXXXX and delete the ones that you do not wish to use. See SOLR-4953 &
|
||
SOLR-5108 for more details.
|
||
|
||
* In the past, schema.xml parsing would silently ignore "default" or "required"
options specified on <dynamicField/> declarations. Beginning with 4.5, attempting
to configure these on a dynamic field will cause an init error. If you
encounter one of these errors when upgrading an existing schema.xml, you can
safely remove these attributes, regardless of their value, from your config and
Solr will continue to behave exactly as it did in previous versions. See
SOLR-5227 for more details.
|
||
|
||
* The UniqFieldsUpdateProcessorFactory has been improved to support all of the
FieldMutatingUpdateProcessorFactory selector options. The <lst name="fields">
init param option is now deprecated and should be replaced with the more standard
<arr name="fieldName">. See SOLR-4249 for more details.
|
||
|
||
* UpdateRequestExt has been removed as part of SOLR-4816. You should use UpdateRequest
|
||
instead.
|
||
|
||
* CloudSolrServer can now use multiple threads to add documents by default. This is a
|
||
small change in runtime semantics when using the bulk add method - you will still
|
||
end up with the same exception on a failure, but some documents beyond the one that
|
||
failed may have made it in. To get the old, single threaded behavior, set parallel updates
|
||
to false on the CloudSolrServer instance.
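
Before upgrading, it may help to scan solrconfig.xml for settings that occur more
than once, since the stricter parsing described in the first note above now rejects
them. The Python sketch below is one way to do that under stated assumptions: the
list of single-valued paths and the sample config are made up for illustration, and
the check is not the one Solr itself performs.

```python
import xml.etree.ElementTree as ET


def duplicated_settings(xml_text, single_valued_paths):
    """Map each path that matches more than one element to its match count."""
    root = ET.fromstring(xml_text)
    counts = {path: len(root.findall(path)) for path in single_valued_paths}
    return {path: n for path, n in counts.items() if n > 1}


# Hypothetical config with one setting accidentally declared twice.
SAMPLE = """
<config>
  <indexConfig>
    <ramBufferSizeMB>100</ramBufferSizeMB>
    <ramBufferSizeMB>256</ramBufferSizeMB>
  </indexConfig>
  <dataDir>/var/solr/data</dataDir>
</config>
"""

print(duplicated_settings(SAMPLE, ["indexConfig/ramBufferSizeMB", "dataDir"]))
```

Paths are given relative to the root <config> element, in the limited XPath syntax
that ElementTree supports.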
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
New Features
|
||
----------------------
|
||
|
||
* SOLR-5219: Rewritten selection of the default search and document clustering
|
||
algorithms. (Dawid Weiss)
|
||
|
||
* SOLR-5202: Support easier overrides of Carrot2 clustering attributes via
|
||
XML data sets exported from the Workbench. (Dawid Weiss)
|
||
|
||
* SOLR-5126: Update Carrot2 clustering to version 3.8.0, update Morfologik
|
||
to version 1.7.1 (Dawid Weiss)
|
||
|
||
* SOLR-2345: Enhanced geodist() to work with an RPT field, provided that the
|
||
field is referenced via 'sfield' and the query point is constant.
|
||
(David Smiley)
|
||
|
||
* SOLR-5082: The encoding of URL-encoded query parameters can be changed with
|
||
the "ie" (input encoding) parameter, e.g. "select?q=m%FCller&ie=ISO-8859-1".
|
||
The default is UTF-8. To change the encoding of POSTed content, use the
|
||
"Content-Type" HTTP header. (Uwe Schindler, Shawn Heisey)
|
||
|
||
* SOLR-4221: Custom sharding (Noble Paul)
|
||
|
||
* SOLR-4808: Persist and use router,replicationFactor and maxShardsPerNode at Collection
|
||
and Shard level (Noble Paul, Shalin Mangar)
|
||
|
||
* SOLR-5006: CREATESHARD command for 'implicit' shards (Noble Paul)
|
||
|
||
* SOLR-5017: Allow sharding based on the value of a field (Noble Paul)
|
||
|
||
* SOLR-4222: create custom sharded collection via collections API (Noble Paul)
|
||
|
||
* SOLR-4718: Allow solr.xml to be stored in ZooKeeper. (Mark Miller, Erick Erickson)
|
||
|
||
* SOLR-5156: Enhance ZkCLI to allow uploading of arbitrary files to ZK. (Erick Erickson)
|
||
|
||
* SOLR-5165: Single-valued docValues fields no longer require a default value.
|
||
Additionally they work with sortMissingFirst, sortMissingLast, facet.missing,
|
||
exists() in function queries, etc. (Robert Muir)
|
||
|
||
* SOLR-5182: Add NoOpRegenerator, a regenerator for custom per-segment caches
|
||
where items are preserved across commits. (Robert Muir)
|
||
|
||
* SOLR-4249: UniqFieldsUpdateProcessorFactory now extends
FieldMutatingUpdateProcessorFactory and supports all of its selector options. Use
of the "fields" init param is now deprecated in favor of "fieldName" (hossman)
|
||
|
||
* SOLR-2548: Allow multiple threads to be specified for faceting. When threading, one
|
||
can specify facet.threads to parallelize loading the uninverted fields. In at least
|
||
one extreme case this reduced warmup time from 20 seconds to 3 seconds. (Janne Majaranta,
|
||
Gun Akkor via Erick Erickson, David Smiley)
|
||
|
||
* SOLR-4816: CloudSolrServer can now route updates locally and no longer relies on inter-node
|
||
update forwarding. (Joel Bernstein, Shikhar Bhushan, Stephen Riesenberg, Mark Miller)
|
||
|
||
* SOLR-3249: Allow CloudSolrServer and SolrCmdDistributor to use JavaBin. (Mark Miller)
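
To see why the "ie" parameter from SOLR-5082 above matters, the Python sketch below
decodes the same percent-encoded query value under two different input encodings.
It only illustrates the decoding step on the client side; the helper name
decode_param is hypothetical and this is not Solr's request parser.

```python
from urllib.parse import unquote


def decode_param(raw, ie="UTF-8"):
    """Percent-decode `raw`, interpreting the bytes in the encoding `ie`."""
    # Bytes that are invalid in `ie` become U+FFFD instead of raising.
    return unquote(raw, encoding=ie, errors="replace")


# %FC is "ü" in ISO-8859-1 but is not a valid UTF-8 sequence on its own.
print(decode_param("m%FCller", ie="ISO-8859-1"))
print(decode_param("m%FCller"))
# The UTF-8 encoding of the same character is the two bytes C3 BC.
print(decode_param("m%C3%BCller"))
```

A client that sends ISO-8859-1 encoded parameters without declaring the encoding
would therefore be searching for a different string than it intended.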
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
|
||
* SOLR-3633: web UI reports an error if CoreAdminHandler says there are no
|
||
SolrCores (steffkes)
|
||
|
||
* SOLR-4489: SpellCheckComponent can throw StringIndexOutOfBoundsException
|
||
when generating collations involving multiple word-break corrections.
|
||
(James Dyer)
|
||
|
||
* SOLR-5107: Fixed NPE when using numTerms=0 in LukeRequestHandler
|
||
(Ahmet Arslan, hossman)
|
||
|
||
* SOLR-4679, SOLR-4908, SOLR-5124: Text extracted from HTML or PDF files
|
||
using Solr Cell was missing ignorable whitespace, which is inserted by
|
||
TIKA for convenience to support plain text extraction without using the
|
||
HTML elements. This bug resulted in glued words. (hossman, Uwe Schindler)
|
||
|
||
* SOLR-5121: zkcli usage help for makepath doesn't match actual command.
|
||
(Daniel Collins via Mark Miller)
|
||
|
||
* SOLR-5119: Managed schema problems after adding fields via Schema Rest API.
|
||
(Nils Kübler, Steve Rowe)
|
||
|
||
* SOLR-5133: HdfsUpdateLog can fail to close a FileSystem instance if init
|
||
is called more than once. (Mark Miller)
|
||
|
||
* SOLR-5135: Harden Collection API deletion of /collections/$collection
|
||
ZooKeeper node. (Mark Miller)
|
||
|
||
* SOLR-4764: When using NRT, just init the first reader from IndexWriter.
|
||
(Robert Muir, Mark Miller)
|
||
|
||
* SOLR-5122: Fixed bug in spellcheck.collateMaxCollectDocs. Eliminates risk
|
||
of divide by zero, and makes estimated hit counts meaningful in non-optimized
|
||
indexes. (hossman)
|
||
|
||
* SOLR-3936: Fixed QueryElevationComponent sorting when used with Grouping
|
||
(Michael Garski via hossman)
|
||
|
||
* SOLR-5171: SOLR Admin gui works in IE9, breaks in IE10. (Joseph L Howard via
|
||
steffkes)
|
||
|
||
* SOLR-5174: Admin UI - Query View doesn't highlight (json) Result if it
|
||
contains HTML Tags (steffkes)
|
||
|
||
* SOLR-4817: Solr should not fall back to the back compat built in solr.xml in SolrCloud
mode (Erick Erickson)
|
||
|
||
* SOLR-5112: Show full message in Admin UI Logging View (Matthew Keeney via
|
||
steffkes)
|
||
|
||
* SOLR-5190: SolrEntityProcessor substitutes variables only once in child entities
|
||
(Harsh Chawla, shalin)
|
||
|
||
* SOLR-3852: Fixed ZookeeperInfoServlet so that the SolrCloud Admin UI pages will
|
||
work even if ZK contains nodes with data which are not utf8 text. (hossman)
|
||
|
||
* SOLR-5206: Fixed OpenExchangeRatesOrgProvider to use refreshInterval correctly
|
||
(Catalin, hossman)
|
||
|
||
* SOLR-5215: Fix possibility of deadlock in ZooKeeper ConnectionManager.
|
||
(Mark Miller, Ricardo Merizalde)
|
||
|
||
* SOLR-4909: Use DirectoryReader.openIfChanged in non-NRT mode.
|
||
(Michael Garski via Robert Muir)
|
||
|
||
* SOLR-5227: Correctly fail schema initialization if a dynamicField is configured to
be required, or have a default value. (hossman)
|
||
|
||
* SOLR-5231: Fixed a bug with the behavior of BoolField that caused documents w/o
|
||
a value for the field to act as if the value were true in functions if no other
|
||
documents in the same index segment had a value of true.
|
||
(Robert Muir, hossman, yonik)
|
||
|
||
* SOLR-5233: The "deleteshard" collections API doesn't wait for cluster state to update,
can fail if some nodes of the deleted shard were down, and had incorrect logging.
(Christine Poerschke, shalin)
|
||
|
||
* SOLR-5150: HdfsIndexInput may not fully read requested bytes. (Mark Miller, Patrick Hunt)
|
||
|
||
* SOLR-5240: All solr cores will now be loaded in parallel (as opposed to a fixed number)
|
||
in zookeeper mode to avoid deadlocks due to replicas waiting for other replicas
|
||
to come up. (yonik)
|
||
|
||
* SOLR-5243: Killing a shard in one collection can result in leader election in a different
|
||
collection if they share the same coreNodeName. (yonik, Mark Miller)
|
||
|
||
* SOLR-5281: IndexSchema log message was printing '[null]' instead of
|
||
'[<core name>]' (Jun Ohtani via Steve Rowe)
|
||
|
||
* SOLR-5279: Implicit properties don't seem to exist on core RELOAD
|
||
(elyograg, hossman, Steve Rowe)
|
||
|
||
* SOLR-5291: Solrj does not propagate the root cause to the user for many errors.
|
||
(Mark Miller)
|
||
|
||
Optimizations
|
||
----------------------
|
||
|
||
* SOLR-5044: Admin UI - Note on Core-Admin about directories while creating
|
||
core (steffkes)
|
||
|
||
* SOLR-5134: Have HdfsIndexOutput extend BufferedIndexOutput.
|
||
(Mark Miller, Uwe Schindler)
|
||
|
||
* SOLR-5057: QueryResultCache should not depend on the order of the fq list (Feihong Huang via Erick Erickson)
|
||
|
||
* SOLR-4816: CloudSolrServer now uses multiple threads to send updates by default.
|
||
(Joel Bernstein via Mark Miller)
|
||
|
||
* SOLR-3530: Better error messages / Content-Type validation in SolrJ. (Mark Miller, hossman)
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-4708: Enable ClusteringComponent by default in collection1 example.
|
||
The solr.clustering.enabled system property is set to 'true' by default.
|
||
(ehatcher, Dawid Weiss)
|
||
|
||
* SOLR-4914, SOLR-5162: Factor out core list persistence and discovery into a
|
||
new CoresLocator interface. (Alan Woodward, Shawn Heisey)
|
||
|
||
* SOLR-5056: Improve type safety of ConfigSolr class. (Alan Woodward)
|
||
|
||
* SOLR-4951: Better randomization of MergePolicy in Solr tests (hossman)
|
||
|
||
* SOLR-4953, SOLR-5108: Make XML Configuration parsing fail if an xpath matches
|
||
multiple nodes when only a single value or plugin instance is expected.
|
||
(hossman)
|
||
|
||
* The routing parameter "shard.keys" is deprecated as part of SOLR-5017. The new parameter name is '_route_'.
The old parameter should continue to work for another release (Noble Paul)
|
||
|
||
* SOLR-5173: Solr-core's Maven configuration includes test-only Hadoop
|
||
dependencies as indirect compile-time dependencies.
|
||
(Chris Collins, Steve Rowe)
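
Client code that still sends the deprecated "shard.keys" parameter can be migrated
by renaming it to '_route_' before the request is issued. The Python sketch below
shows one way to do that on a raw query string; the helper name and the sample
route value "tenantA!" are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def migrate_route_param(query_string):
    """Rename every shard.keys parameter to _route_, keeping order and values."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    renamed = [("_route_" if key == "shard.keys" else key, value)
               for key, value in pairs]
    return urlencode(renamed)


print(migrate_route_param("q=*:*&shard.keys=tenantA!"))
```

Other parameters pass through unchanged, apart from being re-encoded by urlencode.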
|
||
|
||
================== 4.4.0 ==================
|
||
|
||
Versions of Major Components
|
||
---------------------
|
||
Apache Tika 1.4
|
||
Carrot2 3.6.2
|
||
Velocity 1.7 and Velocity Tools 2.0
|
||
Apache UIMA 2.3.1
|
||
Apache ZooKeeper 3.4.5
|
||
|
||
Upgrading from Solr 4.3.0
|
||
----------------------
|
||
|
||
* TieredMergePolicy and the various subtypes of LogMergePolicy no longer have
|
||
an explicit "setUseCompoundFile" method. Instead the behavior of new
|
||
segments is determined by the IndexWriter configuration, and the MergePolicy
|
||
is only consulted to determine if merge segments should use the compound
|
||
file format (based on the value of "setNoCFSRatio"). If you have explicitly
|
||
configured one of these classes using <mergePolicy> and include an init arg
|
||
like this...
|
||
<bool name="useCompoundFile">true</bool>
|
||
...this will now be treated as if you specified...
|
||
<useCompoundFile>true</useCompoundFile>
|
||
...directly on the <indexConfig> (overriding any value already set using that
|
||
syntax) and a warning will be logged to update your configuration. Users
|
||
with an explicitly declared <mergePolicy> are encouraged to review the
|
||
current javadocs for their MergePolicy subclass and review their configured
|
||
options carefully. See SOLR-4941, SOLR-4934 and LUCENE-5038 for more
|
||
information.
|
||
|
||
* SOLR-4778: The signature of LogWatcher.registerListener has changed, from
|
||
(ListenerConfig, CoreContainer) to (ListenerConfig). Users implementing their
|
||
own LogWatcher classes will need to change their code accordingly.
|
||
|
||
* LUCENE-5063: ByteField and ShortField have been deprecated and will be removed
|
||
in 5.0. If you are still using these field types, you should migrate your
|
||
fields to TrieIntField.
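
To find configurations affected by the useCompoundFile change in the first note
above, a solrconfig.xml can be scanned for the legacy init arg inside <mergePolicy>.
The Python sketch below does this under stated assumptions: the helper name and the
sample config are hypothetical, and it only reports matches without rewriting
anything.

```python
import xml.etree.ElementTree as ET


def find_legacy_compound_file_args(solrconfig_xml):
    """Return (mergePolicy class, value) for each legacy useCompoundFile arg."""
    root = ET.fromstring(solrconfig_xml)
    hits = []
    for policy in root.iter("mergePolicy"):
        for arg in policy.findall("bool"):
            if arg.get("name") == "useCompoundFile":
                hits.append((policy.get("class"), arg.text))
    return hits


# Hypothetical config that still uses the old init-arg style.
SAMPLE = """
<config>
  <indexConfig>
    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <bool name="useCompoundFile">true</bool>
    </mergePolicy>
  </indexConfig>
</config>
"""

print(find_legacy_compound_file_args(SAMPLE))
```

Each reported value would then be moved to a <useCompoundFile> element directly
under <indexConfig>, as the note describes.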
|
||
|
||
|
||
Detailed Change List
|
||
----------------------
|
||
|
||
New Features
|
||
----------------------
|
||
|
||
* SOLR-3251: Dynamically add fields to schema. (Steve Rowe, Robert Muir, yonik)
|
||
|
||
* SOLR-4761, SOLR-4976: Add option to plugin a merged segment warmer into solrconfig.xml.
|
||
Info about segments warmed in the background is available via infostream.
|
||
(Mark Miller, Ryan Ernst, Mike McCandless, Robert Muir)
|
||
|
||
* SOLR-3240: Add "spellcheck.collateMaxCollectDocs" option so that when testing
potential Collations against the index, SpellCheckComponent will only collect
n documents, thereby estimating the hit-count. This is a performance optimization
in cases where exact hit-counts are unnecessary. Also, when "collateExtendedResults"
is false, this optimization is always made (James Dyer).

* SOLR-4785: New MaxScoreQParserPlugin returning max() instead of sum() of terms (janhoy)

* SOLR-4234: Add support for binary files in ZooKeeper. (Eric Pugh via Mark Miller)

* SOLR-4048: Add findRecursive method to NamedList. (Shawn Heisey)

* SOLR-4228: SolrJ's SolrPing object has new methods for ping, enable, and
disable. (Shawn Heisey, hossman, Steve Rowe)

* SOLR-4893: Extend FieldMutatingUpdateProcessor.ConfigurableFieldNameSelector
to enable checking whether a field matches any schema field. To select field
names that don't match any fields or dynamic fields in the schema, add
<bool name="fieldNameMatchesSchemaField">false</bool> to an update
processor's configuration in solrconfig.xml. (Steve Rowe, hossman)

* SOLR-4921: Admin UI now supports adding documents to Solr (gsingers, steffkes)

* SOLR-4916: Add support to write and read Solr index files and transaction log
files to and from HDFS. (phunt, Mark Miller, Greg Chanan)

* SOLR-4892: Add FieldMutatingUpdateProcessorFactory subclasses
Parse{Date,Integer,Long,Float,Double,Boolean}UpdateProcessorFactory. These
factories have a default selector that matches all fields that either don't
match any schema field, or are in the schema with the corresponding
typeClass. If they see a value that is not a CharSequence, or can't parse
the value, they leave it as is. For multi-valued fields, these processors
will not convert any values unless all are first successfully parsed, or
already are instances of the target class. Ordering the processors as, for
example, [Boolean, Long, Double, Date] will allow values such as
["2", "5", "8.6"] to be left alone by the Boolean and Long processors, but
then converted by the Double processor. (Steve Rowe, hossman)

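Illustration for SOLR-4892: a minimal sketch of the ordered, all-or-nothing
parsing described in that entry. It is not the Solr implementation; the class
and method names (ParseChainSketch, applyStage, run) are hypothetical, only the
JDK is used, and the Date stage is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for an ordered chain of parsing update processors. */
final class ParseChainSketch {

    /** Each parser returns the parsed value, or null when the text does not parse. */
    static Object parseBoolean(String s) {
        if (s.equalsIgnoreCase("true")) return Boolean.TRUE;
        if (s.equalsIgnoreCase("false")) return Boolean.FALSE;
        return null;
    }

    static Object parseLong(String s) {
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    static Object parseDouble(String s) {
        try {
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /**
     * All-or-nothing stage: returns converted values only if every value is
     * already a target instance or is a CharSequence that parses; otherwise
     * the input list is returned untouched.
     */
    static List<Object> applyStage(List<Object> values, Class<?> target,
                                   Function<String, Object> parser) {
        List<Object> converted = new ArrayList<>();
        for (Object v : values) {
            if (target.isInstance(v)) {
                converted.add(v);
            } else if (v instanceof CharSequence) {
                Object parsed = parser.apply(v.toString());
                if (parsed == null) return values; // one failure leaves all values as is
                converted.add(parsed);
            } else {
                return values; // neither text nor the target type
            }
        }
        return converted;
    }

    /** Runs the stages in the order [Boolean, Long, Double]. */
    static List<Object> run(List<Object> values) {
        List<Object> out = applyStage(values, Boolean.class, ParseChainSketch::parseBoolean);
        out = applyStage(out, Long.class, ParseChainSketch::parseLong);
        return applyStage(out, Double.class, ParseChainSketch::parseDouble);
    }
}
```

With the input ["2", "5", "8.6"], the Boolean and Long stages hand the list back
unchanged because at least one value fails to parse, and the Double stage then
converts all three values.
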
* SOLR-4972: Add PUT command to ZkCli tool. (Roman Shaposhnik via Mark Miller)

* SOLR-4973: Add getter method for defaultCollection on CloudSolrServer.
(Furkan KAMACI via Mark Miller)

* SOLR-4897: Add solr/example/example-schemaless/, an example config set
for schemaless mode. (Steve Rowe)

* SOLR-4655: Add option to have Overseer assign generic node names so that
new addresses can host shards without naming confusion. (Mark Miller, Anshum Gupta)

* SOLR-4977: Add option to send IndexWriter's infostream to the logging system.
(Ryan Ernst via Robert Muir)

* SOLR-4693: A "deleteshard" collections API that unloads all replicas of a given
shard and then removes it from the cluster state. It will remove only those shards
which are INACTIVE or have no range (created for custom sharding).
(Anshum Gupta, shalin)

* SOLR-5003: CSV Update Handler supports optionally adding the line number/row id to
a document (gsingers)

* SOLR-5010: Add support for creating copy fields to the Schema REST API (gsingers)

* SOLR-4991: Register QParserPlugins as SolrInfoMBeans (ehatcher)

* SOLR-4943: Add a new system wide info admin handler that exposes the system info
that could previously only be retrieved using a SolrCore. (Mark Miller)

* SOLR-3076: Block joins. Documents and their sub-documents must be indexed
as a block.
{!parent which=<allParents>}<someChildren> takes in a query that matches child
documents and results in matches on their parents.
{!child of=<allParents>}<someParents> takes in a query that matches some parent
documents and results in matches on their children.
(Mikhail Khludnev, Vadim Kirilchuk, Alan Woodward, Tom Burton-West, Mike McCandless,
hossman, yonik)

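Illustration for SOLR-3076: a rough in-memory model of what the two block join
parsers return. This is a sketch only and does not call Solr. It assumes each
block is stored as its child documents followed by their parent; the Doc class,
its fields, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical in-memory model of block-join semantics; not Solr code. */
final class BlockJoinSketch {

    static final class Doc {
        final String id;
        final String type;   // "parent" or "child" (illustrative field)
        final String value;  // illustrative searchable value

        Doc(String id, String type, String value) {
            this.id = id;
            this.type = type;
            this.value = value;
        }
    }

    /** Models {!parent which=<allParents>}<someChildren>: child matches yield their parents. */
    static List<String> toParents(List<Doc> index, Predicate<Doc> allParents,
                                  Predicate<Doc> someChildren) {
        List<String> hits = new ArrayList<>();
        boolean childMatched = false;
        for (Doc d : index) {
            if (allParents.test(d)) {          // a parent closes the current block
                if (childMatched) hits.add(d.id);
                childMatched = false;
            } else if (someChildren.test(d)) {
                childMatched = true;
            }
        }
        return hits;
    }

    /** Models {!child of=<allParents>}<someParents>: parent matches yield their children. */
    static List<String> toChildren(List<Doc> index, Predicate<Doc> allParents,
                                   Predicate<Doc> someParents) {
        List<String> hits = new ArrayList<>();
        List<String> blockChildren = new ArrayList<>();
        for (Doc d : index) {
            if (allParents.test(d)) {
                if (someParents.test(d)) hits.addAll(blockChildren);
                blockChildren.clear();
            } else {
                blockChildren.add(d.id);
            }
        }
        return hits;
    }
}
```

In both methods the allParents predicate plays the role of the which= / of=
filter: it must identify every parent document, because that is how block
boundaries are recognized.
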
Bug Fixes
----------------------

* SOLR-4333: edismax parser to not double-escape colons if already escaped by
the client application (James Dyer, Robert J. van der Boon)

* SOLR-4776: SolrJ doesn't return "between" count in range facets
(Philip K. Warren via shalin)

* SOLR-4616: HitRatio on caches is now exposed over JMX MBeans as a float.
(Greg Bowyer)

* SOLR-4803: Fixed core discovery mode (ie: new style solr.xml) to treat
'collection1' as the default core name. (hossman)

* SOLR-4790: Throw an error if a core has the same name as another core, in both
old and new style solr.xml.

* SOLR-4842: Prevent facet.field local params from affecting other facet.field
parameters. (ehatcher, hossman)

* SOLR-4814: If a SolrCore cannot be created it should remove any information it
published about itself from ZooKeeper. (Mark Miller)

* SOLR-4863: Removed non-existent attribute sourceId from dynamic JMX stats
to fix AttributeNotFoundException (suganuma, hossman via shalin)

* SOLR-4891: JsonLoader should preserve field value types from the JSON content stream.
(Steve Rowe)

* SOLR-4805: SolrCore#reload should not call preRegister and publish a DOWN state to
ZooKeeper. (Mark Miller, Jared Rodriguez)

* SOLR-4899: When reconnecting after ZooKeeper expiration, we need to be willing to wait
forever, not just for 30 seconds. (Mark Miller)

* SOLR-4920: JdbcDataSource incorrectly suppresses exceptions when retrieving a connection from
a JNDI context and falls back to trying to use DriverManager to obtain a connection. Additionally,
if a SQLException is thrown while initializing a connection, such as in setAutoCommit(), the
connection will not be closed. (Chris Eldredge via shalin)

* SOLR-4915: The root cause should be returned to the user when a SolrCore create call fails.
(Mark Miller)

* SOLR-4925: Collection create throws NPE when 'numShards' param is missing (Noble Paul)

* SOLR-4910: persisting solr.xml is broken. More stringent testing of persistence fixed
up a number of issues and several bugs with persistence. Among them are
> no longer persisting implicit properties
> should persist zkHost in the <solr> tag (user's list)
> reloading a core that has transient="true" returned an error. reload should load
  a transient core if it's not yet loaded.
> No longer persisting loadOnStartup or transient core properties if they were not
  specified in the original solr.xml
> Testing flushed out the fact that you couldn't swap a core marked transient=true
  loadOnStartup=false because it hadn't been loaded yet.
> SOLR-4862, CREATE fails to persist schema, config, and dataDir
> SOLR-4363, not persisting coreLoadThreads in <solr> tag
> SOLR-3900, logWatcher properties not persisted
> SOLR-4850, cores defined as loadOnStartup=true, transient=false can't be searched
(Erick Erickson)

* SOLR-4923: Commits to non leaders as part of a request that also contains updates
can execute out of order. (hossman, Ricardo Merizalde, Mark Miller)

* SOLR-4932: persisting solr.xml saves some parameters it shouldn't when they weren't
defined in the original. Benign since the default values are saved, but still incorrect.
(Erick Erickson, thanks Shawn Heisey for helping test!)

* SOLR-4934, SOLR-4941: Fix handling of <mergePolicy> init arg
"useCompoundFile" needed after changes in LUCENE-5038 (hossman)

* SOLR-4456: Admin UI: Displays dashboard even if Solr is down (steffkes)

* SOLR-4949: UI Analysis page dropping characters from input box (steffkes)

* SOLR-4960: Fix race conditions in shutdown of CoreContainer
and getCore that could cause a request to attempt to use a core that
has shut down. (yonik)

* SOLR-4926: Fixed rare replication bug that normally only manifested when
using compound file format. (yonik, Mark Miller)

* SOLR-4974: Outgrowth of SOLR-4960 that includes transient cores and pending cores
(Erick Erickson)

* SOLR-3369: shards.tolerant=true is broken for group queries
(Russell Black, Martijn van Groningen, Jabouille jean Charles, Ryan McKinley via shalin)

* SOLR-4452: Hunspell stemmer should not merge duplicate dictionary entries (janhoy)

* SOLR-5000: ManagedIndexSchema doesn't persist uniqueKey tag after calling addFields
method. (Jun Ohtani, Steve Rowe)

* SOLR-4982: Creating a core while referencing system properties looks like it loses files.
Actually, instanceDir, config, dataDir and schema are not dereferenced properly
when creating cores that reference sys vars (e.g. &dataDir=${dir}). In the dataDir
case in particular this leads to the index being put in a directory literally named
${dir} but on restart the sysvar will be properly dereferenced.

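Illustration for SOLR-4982: a minimal sketch of what "dereferencing" a value such
as ${dir} means, so the difference from a directory literally named ${dir} is
concrete. This is not Solr's code; the class name, the regular expression, and
the choice of exception are assumptions, and only the JDK is used.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that expands ${name} and ${name:default} references. */
final class PropertySubstitutionSketch {

    // Group 1 is the property name, optional group 2 is the default value.
    private static final Pattern REF = Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    static String resolve(String raw, Properties props) {
        Matcher m = REF.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Falls back to the inline default (possibly null) when the property is unset.
            String value = props.getProperty(m.group(1), m.group(2));
            if (value == null) {
                throw new IllegalArgumentException("No value for property: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Under these assumptions, resolving "${dir}/core1" with dir=/var/solr/data gives
"/var/solr/data/core1"; skipping the resolve step is what leaves the literal
text ${dir} in the path.
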
* SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time
is empty. (chakming wong, James Dyer via shalin)

* SOLR-4978: Time is stripped from datetime column when imported into Solr date field
if convertType=true. (Bill Au, shalin)

* SOLR-5019: spurious ConcurrentModificationException when spell check component
was in use with filters. (yonik)

* SOLR-5018: The Overseer should avoid publishing the state for collections that do not
exist under the /collections zk node. (Mark Miller)

* SOLR-5028, SOLR-5029: ShardHandlerFactory was not being created properly when
using new-style solr.xml, and was not being persisted properly when using
old-style. (Tomás Fernández Löbbe, Ryan Ernst, Alan Woodward)

* SOLR-4997: The splitshard api doesn't call commit on new sub shards before
switching shard states. Multiple bugs related to sub shard recovery and
replication are also fixed. (shalin)

* SOLR-5034: A facet.query that parses or analyzes down to a null Query would
throw an NPE. Fixed. (David Smiley)

* SOLR-5039: Admin/Schema Browser displays -1 for term counts for multiValued fields.

* SOLR-5037: The CSV loader now accepts field names that are not in the schema.
(gsingers, ehatcher, Steve Rowe)

Optimizations
----------------------

* SOLR-4923: Commit to all nodes in a collection in parallel rather than locally and
then to all other nodes. (hossman, Ricardo Merizalde, Mark Miller)

* SOLR-3838: Admin UI - Multiple filter queries are not supported in Query UI (steffkes)

* SOLR-4719: Admin UI - Default to wt=json on Query-Screen (steffkes)

* SOLR-4611: Admin UI - Analysis-Urls with empty parameters create empty result table
(steffkes)

* SOLR-4955: Admin UI - Show address bar on top for Schema + Config (steffkes)

* SOLR-4412: New parameter langid.lcmap to map detected language code to be placed
in "language" field (janhoy)

* SOLR-4815: Admin-UI - DIH: Let "commit" be checked by default (steffkes)

* SOLR-5002: optimize numDocs(Query,DocSet) when filterCache is null (Robert Muir)

* SOLR-5012: optimize search with filter when filterCache is null (Robert Muir)

Other Changes
----------------------

* SOLR-4737: Update Guava to 14.0.1 (Mark Miller)

* SOLR-2079: Add option to pass HttpServletRequest in the SolrQueryRequest context map.
(Tomás Fernández Löbbe via Robert Muir)

* SOLR-4738: Update Jetty to 8.1.10.v20130312 (Mark Miller, Robert Muir)

* SOLR-4749: Clean up and refactor CoreContainer code around solr.xml and SolrCore
management. (Mark Miller)

* SOLR-4547: Move logging of filenames on commit from INFO to DEBUG.
(Shawn Heisey, hossman)

* SOLR-4757: Change the example to use the new solr.xml format and core
discovery by directory structure. (Mark Miller)

* SOLR-4759: Velocity (/browse) template cosmetic cleanup.
(Mark Bennett, ehatcher)

* SOLR-4778: LogWatcher init code moved out of CoreContainer (Alan Woodward)

* SOLR-4784: Make class LuceneQParser public (janhoy)

* SOLR-4448: Allow the solr internal load balancer to be more easily pluggable.
(Philip Hoy via Robert Muir)

* SOLR-4224: Refactor JavaBinCodec input stream definition to enhance reuse.
(phunt via Mark Miller)

* SOLR-4931: SolrDeletionPolicy onInit and onCommit methods changed to override
exact signatures (with generics) from IndexDeletionPolicy (shalin)

* SOLR-4942: test improvements to randomize use of compound files (hossman)

* SOLR-4966: CSS, JS and other files in webapp without license (uschindler,
steffkes)

* SOLR-4986: Upgrade to Tika 1.4 (Markus Jelsma via janhoy)

* SOLR-4948, SOLR-5009: Tidied up CoreContainer construction logic.
(Alan Woodward, Uwe Schindler, Steve Rowe)

* LUCENE-5107: Properties files written by Solr now use UTF-8 encoding;
Unicode is no longer escaped. Reading of legacy properties files with
\u escapes is still possible. (Uwe Schindler, Robert Muir)

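Illustration for LUCENE-5107: a small JDK-only sketch of the behaviour the entry
describes, using java.util.Properties with an explicit UTF-8 Writer and Reader.
It is not the code Solr uses; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical helper showing UTF-8 properties I/O with the plain JDK. */
final class Utf8PropertiesSketch {

    /** Stores through a Writer, so non-ASCII characters are written as-is, not as \\uXXXX. */
    static byte[] storeUtf8(Properties props) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            props.store(writer, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Loads through a Reader; \\uXXXX escapes in older files are still decoded. */
    static Properties loadUtf8(byte[] data) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

Because Properties.load(Reader) still interprets backslash-u escapes, a file
written in the older escaped style and a file written as raw UTF-8 load to the
same values.
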
================== 4.3.1 ==================

Versions of Major Components
---------------------
Apache Tika 1.3
Carrot2 3.6.2
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Detailed Change List
----------------------

Bug Fixes
----------------------

* SOLR-4795: Sub shard leader should not accept any updates from parent after
it goes active (shalin)

* SOLR-4798: shard splitting does not respect the router for the collection
when executing the index split. One effect of this is that documents
may be placed in the wrong shard when the default compositeId router
is used in conjunction with IDs containing "!". (yonik)

* SOLR-4797: Shard splitting creates sub shards which have the wrong hash
range in cluster state. This happens when numShards is not a power of two
and router is compositeId. (shalin)

* SOLR-4791: solr.xml sharedLib does not work in 4.3.0 (Ryan Ernst, Jan Høydahl via
Erick Erickson)

* SOLR-4806: Shard splitting does not abort if WaitForState times out (shalin)

* SOLR-4807: The zkcli script now works with log4j. The zkcli.bat script
was broken on Windows in 4.3.0, now it works. (Shawn Heisey)

* SOLR-4813: Fix SynonymFilterFactory to allow init parameters for
tokenizer factory used when parsing synonyms file. (Shingo Sasaki, hossman)

* SOLR-4829: Fix transaction log leaks (a failure to clean up some old logs)
on a shard leader, or when unexpected exceptions are thrown during log
recovery. (Steven Bower, Mark Miller, yonik)

* SOLR-4751: Fix replication problem of files in sub directory of conf directory.
(Minoru Osuka via Koji)

* SOLR-4741: Deleting a collection should set DELETE_DATA_DIR to true.
(Mark Miller)

* SOLR-4752: There are some minor bugs in the Collections API parameter
validation. (Mark Miller)

* SOLR-4563: RSS DIH-example not working (janhoy)

* SOLR-4796: zkcli.sh should honor JAVA_HOME (Roman Shaposhnik via Mark Miller)

* SOLR-4734: Leader election fails with an NPE if there is no UpdateLog.
(Mark Miller, Alexander Eibner)

* SOLR-4868: Setting the log level for the log4j root category results in
adding a new category, the empty string. (Shawn Heisey)

* SOLR-4855: DistributedUpdateProcessor doesn't check for peer sync requests (shalin)

* SOLR-4867: Admin UI - setting loglevel on root throws RangeError (steffkes)

* SOLR-4870: RecentUpdates.update() does not increment numUpdates loop counter
(Alexey Kudinov via shalin)

* SOLR-4877, LUCENE-5023: Removed SolrIndexSearcher#getDocSetNC()'s special
case for handling TermQuery to prevent NullPointerException if reader does
not have fields. (Bao Yang Yang, Uwe Schindler)

* SOLR-4881: Fix DocumentAnalysisRequestHandler to correctly use
EmptyEntityResolver to prevent loading of external entities like
UpdateRequestHandler does. (Hossman, Uwe Schindler)

* SOLR-4858: SolrCore reloading was broken when the UpdateLog
was enabled. (Hossman, Anshum Gupta, Alexey Serba, Mark Miller, yonik)

* SOLR-4853: Fixed SolrJettyTestBase so it may be reused by end users
(hossman)

* SOLR-4744: Update failure on sub shard is not propagated to clients by parent
shard (Anshum Gupta, yonik, shalin)

Other Changes
----------------------

* SOLR-4760: Include core name in logs when loading schema.
(Shawn Heisey)

================== 4.3.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.3
Carrot2 3.6.2
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Upgrading from Solr 4.2.0
----------------------

* In the schema REST API, the output path for copyFields and dynamicFields
has been changed from all lowercase "copyfields" and "dynamicfields" to
camelCase "copyFields" and "dynamicFields", respectively, to align with all
other schema REST API outputs, which use camelCase. The URL format remains
the same: all resource names are lowercase. See SOLR-4623 for details.

* Slf4j/logging jars are no longer included in the Solr webapp. All logging
jars are now in example/lib/ext. Changing logging impls is now as easy as
updating the jars in this folder with those necessary for the logging impl
you would like. If you are using another webapp container, these jars will
need to go in the corresponding location for that container.
In conjunction, the dist-excl-slf4j and dist-war-excl-slf4j build targets
have been removed since they are redundant. See the Slf4j documentation,
SOLR-3706, and SOLR-4651 for more details.

* The hardcoded SolrCloud defaults for 'hostContext="solr"' and
'hostPort="8983"' have been deprecated and will be removed in Solr 5.0.
Existing solr.xml files that do not have these options explicitly specified
should be updated accordingly. See SOLR-4622 for more details.

Detailed Change List
----------------------

New Features
----------------------

* SOLR-4648: PreAnalyzedUpdateProcessorFactory allows using the functionality
of PreAnalyzedField with other field types. See javadoc for details and
examples. (Andrzej Bialecki)

* SOLR-4623: Provide REST API read access to all elements of the live schema.
Add a REST API request to return the entire live schema, in JSON, XML, and
schema.xml formats. Move REST API methods from package org.apache.solr.rest
to org.apache.solr.rest.schema, and rename base functionality REST API
classes to remove the current schema focus, to prepare for other non-schema
REST APIs. Change output path for copyFields and dynamicFields from
"copyfields" and "dynamicfields" (all lowercase) to "copyFields" and
"dynamicFields", respectively, to align with all other REST API outputs, which
use camelCase.
(Steve Rowe)

* SOLR-4658: In preparation for REST API requests that can modify the schema,
a "managed schema" is introduced.
Add '<schemaFactory class="ManagedSchemaFactory" mutable="true"/>' to solrconfig.xml
in order to use it, and to enable schema modifications via REST API requests.
(Steve Rowe, Robert Muir)

* SOLR-4656: Added two new highlight parameters, hl.maxMultiValuedToMatch and
hl.maxMultiValuedToExamine. maxMultiValuedToMatch stops looking for snippets after
finding the specified number of matches, no matter how far into the multivalued field
you've gone. maxMultiValuedToExamine stops looking for matches after the specified
number of multiValued entries have been examined. If both are specified, the limit
hit first stops the loop. Also this patch cuts down on the copying of the document
entries during highlighting. These optimizations are probably unnoticeable unless
there are a large number of entries in the multiValued field. Note that this will
prevent the "best" match from being found if it appears later in the MV list than the
cutoff specified by either of these params. (Erick Erickson)

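Illustration for SOLR-4656: a minimal sketch of how two independent limits can
stop a scan over the values of a multiValued field, whichever is hit first. It
is not the highlighter's code; the class name, the substring test standing in
for "produces a snippet", and the parameter names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the two highlighting cutoffs. */
final class HighlightLimitsSketch {

    /**
     * Returns the positions of the values that matched, examining at most
     * maxToExamine values and stopping once maxToMatch matches were found.
     */
    static List<Integer> scan(List<String> values, String term,
                              int maxToExamine, int maxToMatch) {
        List<Integer> matched = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            // Whichever limit is reached first ends the loop.
            if (i >= maxToExamine || matched.size() >= maxToMatch) break;
            if (values.get(i).contains(term)) matched.add(i);
        }
        return matched;
    }
}
```

A value that would have matched but sits beyond either cutoff is never looked
at, which is the trade-off the entry warns about.
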
* SOLR-4675: Improve PostingsSolrHighlighter to support per-field/query-time overrides
and add additional configuration parameters. See the javadocs for more details and
examples. (Robert Muir)

* SOLR-3755: A new collections api to add additional shards dynamically by splitting
existing shards. (yonik, Anshum Gupta, shalin)

* SOLR-4530: DIH: Provide configuration to use Tika's IdentityHtmlMapper
(Alexandre Rafalovitch via shalin)

* SOLR-4662: Discover SolrCores by directory structure rather than defining them
in solr.xml. Also, change the format of solr.xml to be closer to that of solrconfig.xml.
This version of Solr will ship the example in the old style, but you can manually
try the new style. Solr 4.4 will ship with the new style, and Solr 5.0 will remove
support for the old style. (Erick Erickson, Mark Miller)
Additional Work:
- SOLR-4347: Ensure that newly-created cores via Admin handler are persisted in solr.xml
  (Erick Erickson)
- SOLR-1905: Cores created by the admin request handler should be persisted to solr.xml.
  Also fixed a problem whereby properties like solr.solr.datadir would be persisted
  to solr.xml. Also, cores that didn't happen to be loaded were not persisted.
  (Erick Erickson)

* SOLR-4717/SOLR-1351: SimpleFacets now work with localParams allowing faceting on the
same field multiple ways (ryan, Uri Boness)

* SOLR-4671: CSVResponseWriter now supports pseudo fields. (ryan, nihed mbarek)

* SOLR-4358: HttpSolrServer sends the stream name and exposes 'useMultiPartPost'
(Karl Wright via ryan)

Bug Fixes
----------------------

* SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not work.
(Ryan Ernst, Robert Muir via Erick Erickson)

* SOLR-4634: Fix scripting engine tests to work with Java 8's "Nashorn" JavaScript
implementation. (Uwe Schindler)

* SOLR-4636: If opening a reader fails for some reason when opening a SolrIndexSearcher,
a Directory can be left unreleased. (Mark Miller)

* SOLR-4405: Admin UI - admin-extra files are not rendered into the core-menu (steffkes)

* SOLR-3956: Fixed group.facet=true to work with negative facet.limit
(Chris van der Merwe, hossman)

* SOLR-4650: copyField doesn't work with source globs that don't match any
explicit or dynamic fields. This regression was introduced in Solr 4.2.
(Daniel Collins, Steve Rowe)

* SOLR-4641: Schema now throws exception on illegal field parameters. (Robert Muir)

* SOLR-3758: Fixed SpellCheckComponent to work consistently with distributed grouping
(James Dyer)

* SOLR-4652: Fix broken behavior with shared libraries in resource loader for
solr.xml plugins. (Ryan Ernst, Robert Muir, Uwe Schindler)

* SOLR-4664: ZkStateReader should update aliases on construction.
(Mark Miller, Elodie Sannier)

* SOLR-4682: CoreAdminRequest.mergeIndexes cannot merge multiple cores or indexDirs.
(Jason.D.Cao via shalin)

* SOLR-4581: When faceting on numeric fields in Solr 4.2, negative values (constraints)
were sorted incorrectly. (Alexander Buhr, shalin, yonik)

* SOLR-4699: The System admin handler should not assume a file system based data directory
location. (Mark Miller)

* SOLR-4695: Fix core admin SPLIT action to be useful with non-cloud setups (shalin)

* SOLR-4680: Correct example spellcheck configuration's queryAnalyzerFieldType and
use "text" field instead of narrower "name" field (ehatcher, Mark Bennett)

* SOLR-4702: Fix example /browse "Did you mean?" suggestion feature. (ehatcher, Mark Bennett)

* SOLR-4710: You cannot delete a collection fully from ZooKeeper unless all nodes are up and
functioning correctly. (Mark Miller)

* SOLR-4487: SolrExceptions thrown by HttpSolrServer will now contain the
proper HTTP status code returned by the remote server, even if that status
code is not something Solr itself returned -- eg: from the Servlet Container,
or an intermediate HTTP Proxy (hossman)

* SOLR-4661: Admin UI Replication details now correctly displays the current
replicable generation/version of the master. (hossman)

* SOLR-4716, SOLR-4584: SolrCloud request proxying does not work on Tomcat and
perhaps other non Jetty containers. (Po Rui, Yago Riveiro via Mark Miller)

* SOLR-4746: Distributed grouping used a NamedList instead of a SimpleOrderedMap
for the top level group commands, causing output formatting differences
compared to non-distributed grouping. (yonik)

* SOLR-4705: Fixed bug causing NPE when querying a single replica in SolrCloud
using the shards param (Raintung Li, hossman)

* SOLR-4729: LukeRequestHandler: Using a dynamic copyField source that is
not also a dynamic field triggers error message 'undefined field: "(glob)"'.
(Adam Hahn, hossman, Steve Rowe)

Optimizations
----------------------

Other Changes
----------------------

* SOLR-4653: Solr configuration should log inaccessible/non-existent relative paths in lib
dir=... (Dawid Weiss)

* SOLR-4317: SolrTestCaseJ4: Can't avoid "collection1" convention (Tricia Jenkins via Erick Erickson)

* SOLR-4571: SolrZkClient#setData should return Stat object. (Mark Miller)

* SOLR-4603: CachingDirectoryFactory should use an IdentityHashMap for
byDirectoryCache. (Mark Miller)

* SOLR-4544: Refactor HttpShardHandlerFactory so load-balancing logic can be customized.
(Ryan Ernst via Robert Muir)

* SOLR-4607: Use noggit 0.5 release jar rather than a forked copy. (Yonik Seeley, Robert Muir)

* SOLR-3706: Ship setup to log with log4j. (ryan, Mark Miller)

* SOLR-4651: Remove dist-excl-slf4j build target. (Shawn Heisey)

* SOLR-4622: The hardcoded SolrCloud defaults for 'hostContext="solr"' and
'hostPort="8983"' have been deprecated and will be removed in Solr 5.0.
Existing solr.xml files that do not have these options explicitly specified
should be updated accordingly. (hossman)

* SOLR-4672: Requests attempting to use SolrCores which had init failures
(that would be reported by CoreAdmin STATUS requests) now result in 500
error responses with the details about the init failure, instead of 404
error responses. (hossman)

* SOLR-4730: Make the wiki link more prominent in the release documentation.
(Uri Laserson via Robert Muir)

================== 4.2.1 ==================

Versions of Major Components
---------------------
Apache Tika 1.3
Carrot2 3.6.2
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Detailed Change List
----------------------

Bug Fixes
----------------------

* SOLR-4567: copyField source glob matching explicit field(s) stopped working
in Solr 4.2. (Alexandre Rafalovitch, Steve Rowe)

* SOLR-4475: Fix various places that still assume File based paths even when
not using a file based DirectoryFactory. (Mark Miller)

* SOLR-4551: CachingDirectoryFactory needs to create CacheEntry's with the
fullpath not path. (Mark Miller)

* SOLR-4555: When forceNew is used with CachingDirectoryFactory#get, the old
CacheValue should give up its path as it will be used by a new Directory
instance. (Mark Miller)

* SOLR-4578: CoreAdminHandler#handleCreateAction gets a SolrCore and does not
close it in SolrCloud mode when a core with the same name already exists.
(Mark Miller)

* SOLR-4574: The Collections API will silently return success on an unknown
ACTION parameter. (Mark Miller)

* SOLR-4576: Collections API validation errors should cause an exception on
clients and otherwise act as validation errors with the Core Admin API.
(Mark Miller)

* SOLR-4577: The collections API should return responses (success or failure)
for each node it attempts to work with. (Mark Miller)

* SOLR-4568: The lastPublished state check before becoming a leader is not
working correctly. (Mark Miller)

* SOLR-4570: Even if an explicit shard id is used, ZkController#preRegister
should still wait to see the shard id in its current ClusterState.
(Mark Miller)

* SOLR-4585: The Collections API validates numShards with < 0 but should use
<= 0. (Mark Miller)

* SOLR-4592: DefaultSolrCoreState#doRecovery needs to check the CoreContainer
shutdown flag inside the recoveryLock sync block. (Mark Miller)

* SOLR-4595: CachingDirectoryFactory#close can throw a concurrent
modification exception. (Mark Miller)

* SOLR-4573: Accessing Admin UI files in SolrCloud mode logs warnings.
(Mark Miller, Phil John)

* SOLR-4594: StandardDirectoryFactory#remove accesses byDirectoryCache
without a lock. (Mark Miller)

* SOLR-4597: CachingDirectoryFactory#remove should not attempt to empty/remove
the index right away but flag for removal after close. (Mark Miller)

* SOLR-4598: The Core Admin unload command's option 'deleteDataDir' should use
the DirectoryFactory API to remove the data dir. (Mark Miller)

* SOLR-4599: CachingDirectoryFactory calls close(Directory) on forceNew if the
Directory has a refCnt of 0, but it should call closeDirectory(CacheValue).
(Mark Miller)

* SOLR-4602: ZkController#unregister should cancel its election participation
before asking the Overseer to delete the SolrCore information. (Mark Miller)

* SOLR-4601: A Collection that is only partially created and then deleted will
leave pre allocated shard information in ZooKeeper. (Mark Miller)

* SOLR-4604: UpdateLog#init is called too often on SolrCore#reload. (Mark Miller)

* SOLR-4605: Rollback does not work correctly. (Mark S, Mark Miller)

* SOLR-4609: The Collections API should only send the reload command to ACTIVE
cores. (Mark Miller)

* SOLR-4297: Atomic update request containing null=true sets all subsequent
fields to null (Ben Pennell, Rob, shalin)

* SOLR-4371: Admin UI - Analysis Screen shows empty result (steffkes)

* SOLR-4318: NPE encountered when querying with wildcards on a field that uses
the DefaultAnalyzer (i.e. no analysis chain defined). (Erick Erickson)

* SOLR-4361: DataImportHandler would throw UnsupportedOperationException if
handler-level parameters were specified containing periods in the name
(James Dyer)

* SOLR-4538: Date Math expressions were being truncated to 32 characters
when used in field:value queries in the lucene QParser. (hossman, yonik)

* SOLR-4617: SolrCore#reload needs to pass the deletion policy to the next
SolrCore through its constructor rather than setting a field after.
(Mark Miller)

* SOLR-4589: Fixed CPU spikes and poor performance in lazy field loading
of multivalued fields. (hossman)

* SOLR-4608: Update Log replay and PeerSync replay should use the default
processor chain to update the index. (Ludovic Boutros, yonik)

* SOLR-4625: The solr (lucene syntax) query parser lost top-level boost
values and top-level phrase slops on queries produced by nested
sub-parsers. (yonik)

* SOLR-4624: CachingDirectoryFactory does not need to support forceNew any
longer and it appears to be causing a missing close directory bug. forceNew
is no longer respected and will be removed in 4.3. (Mark Miller)

* SOLR-3819: Grouped faceting (group.facet=true) did not respect filter
exclusions. (Petter Remen, yonik)

* SOLR-4637: Replication can sometimes wait until shutdown or core unload before
removing some tmp directories. (Mark Miller)

* SOLR-4638: DefaultSolrCoreState#getIndexWriter(null) is a way to avoid
creating the IndexWriter earlier than necessary, but it's not
implemented quite right. (Mark Miller)

* SOLR-4640: CachingDirectoryFactory can fail to close directories in some race
conditions. (Mark Miller)

* SOLR-4642: QueryResultKey is not calculating the correct hashCode for filters.
(Joel Bernstein via Mark Miller)

Optimizations
----------------------

* SOLR-4569: waitForReplicasToComeUp should bail right away if it doesn't see the
expected slice in the clusterstate rather than waiting. (Mark Miller)

* SOLR-4311: Admin UI - Optimize Caching Behaviour (steffkes)

Other Changes
----------------------

* SOLR-4537: Clean up schema information REST API. (Steve Rowe)

* SOLR-4596: DistributedQueue should ensure its full path exists in the constructor.
(Mark Miller)

================== 4.2.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.3
Carrot2 3.6.2
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Upgrading from Solr 4.1.0
----------------------

(No upgrade instructions yet)

Detailed Change List
----------------------

New Features
----------------------
|
||
|
||
* SOLR-4043: Add ability to get success/failure responses from Collections API.
|
||
(Raintung Li, Mark Miller)
|
||
|
||
* SOLR-2827: RegexpBoost Update Processor (janhoy)
|
||
|
||
* SOLR-4370: Allow configuring commitWithin to do hard commits.
|
||
(Mark Miller, Senthuran Sivananthan)
|
||
|
||
* SOLR-4451: SolrJ, and SolrCloud internals, now use SystemDefaultHttpClient
|
||
under the covers -- allowing many HTTP connection related properties to be
|
||
controlled via 'standard' java system properties. (hossman)
|
||
|
||
* SOLR-3855, SOLR-4490: Doc values support. (Adrien Grand, Robert Muir)
|
||
|
||
* SOLR-4417: Reopen the IndexWriter on SolrCore reload. (Mark Miller)
|
||
|
||
* SOLR-4477: Add support for queries (match-only) against docvalues fields.
|
||
(Robert Muir)
|
||
|
||
* SOLR-4488: Return slave replication details for a master if the master has
|
||
also acted like a slave. (Mark Miller)
|
||
|
||
* SOLR-4498: Add list command to ZkCLI that prints out the contents of
|
||
ZooKeeper. (Roman Shaposhnik via Mark Miller)
|
||
|
||
* SOLR-4481: SwitchQParserPlugin registered by default as 'switch' using
|
||
syntax: {!switch case=XXX case.foo=YYY case.bar=ZZZ default=QQQ}foo
|
||
(hossman)
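  Illustration for SOLR-4481 (a sketch, not the parser's actual code): the
  function below mimics how the switch syntax above is understood to pick a
  sub-query. It assumes the bare "case" param handles blank input, "case.<value>"
  handles an exact match on the trimmed input, and "default" is the fallback.
  The name resolve_switch is hypothetical.

```python
def resolve_switch(params, value):
    """Pick the sub-query a {!switch} parser would be expected to select."""
    key = value.strip()
    if key == "":
        chosen = params.get("case")          # bare "case" covers blank input
    else:
        chosen = params.get("case." + key)   # "case.<value>" covers exact matches
    if chosen is None:
        chosen = params.get("default")       # fall back to "default" when present
    if chosen is None:
        raise ValueError("no matching case and no default for %r" % value)
    return chosen


# Local params taken from the syntax line in the entry above.
params = {"case": "XXX", "case.foo": "YYY", "case.bar": "ZZZ", "default": "QQQ"}
selected = resolve_switch(params, "foo")
```

  With the params shown, the input "foo" selects the "case.foo" sub-query.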
|
||
|
||
* SOLR-4078: Allow custom naming of SolrCloud nodes so that a new host:port
|
||
combination can take over for a previous shard. (Mark Miller)
|
||
|
||
* SOLR-4210: Requests to a Collection that does not exist on the receiving node
|
||
should be proxied to a suitable node. (Mark Miller, Po Rui, yonik)
|
||
|
||
* SOLR-1365: New SweetSpotSimilarityFactory allows customizable TF/IDF based
|
||
Similarity when you know the optimal "Sweet Spot" of values for the field
|
||
length and TF scoring factors. (hossman)
|
||
|
||
* SOLR-4138: CurrencyField fields can now be used in ValueSources to
|
||
get the "raw" value (using the default number of fractional digits) in
|
||
the default currency of the field type. There is also a new
|
||
currency(field,[CODE]) function for generating a ValueSource of the
|
||
"natural" value, converted to an optionally specified currency to
|
||
override the default for the field type.
|
||
(hossman)
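  Illustration for SOLR-4138 (a sketch under assumptions): the helper below only
  builds request parameters that use the currency(field,[CODE]) function. The
  field name price_c, the alias price_eur and the helper currency_fn are
  hypothetical.

```python
from urllib.parse import urlencode


def currency_fn(field, code=None):
    """Return a currency(...) function string, with an optional currency code."""
    if code:
        return "currency(%s,%s)" % (field, code)
    return "currency(%s)" % field


params = {
    "q": "*:*",
    # Return the value converted to EUR under a hypothetical alias.
    "fl": "id,price_eur:" + currency_fn("price_c", "EUR"),
    # Sort on the value in the field type's default currency.
    "sort": currency_fn("price_c") + " asc",
}
query_string = urlencode(params)
```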
|
||
|
||
* SOLR-4503: Add REST API methods, via Restlet integration, for reading schema
|
||
elements, at /schema/fields/, /schema/dynamicfields/, /schema/fieldtypes/,
|
||
and /schema/copyfields/. (Steve Rowe)
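  Illustration for SOLR-4503 (a sketch under assumptions): the four read-only
  schema paths listed above, joined to a core URL. The base URL and core name
  are hypothetical; only the /schema/... paths come from this entry.

```python
BASE = "http://localhost:8983/solr/collection1"  # hypothetical host and core

# Resource names exactly as listed in the entry above.
SCHEMA_RESOURCES = ("fields", "dynamicfields", "fieldtypes", "copyfields")


def schema_url(resource, base=BASE):
    """Build the URL for one of the schema resources."""
    if resource not in SCHEMA_RESOURCES:
        raise ValueError("unknown schema resource: %r" % resource)
    return "%s/schema/%s/" % (base, resource)


urls = [schema_url(name) for name in SCHEMA_RESOURCES]
```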
|
||
|
||
Bug Fixes
----------------------
|
||
|
||
* SOLR-2850: Do not refine facets when minCount == 1
|
||
(Matt Smith, lundgren via Adrien Grand)
|
||
|
||
* SOLR-4309: /browse: Improve JQuery autosuggest behavior (janhoy)
|
||
|
||
* SOLR-4330: group.sort is ignored when using group.truncate and ex/tag
|
||
local params together (koji)
|
||
|
||
* SOLR-4321: Collections API will sometimes use a node more than once, even
|
||
when more unused nodes are available.
|
||
(Eric Falcao, Brett Hoerner, Mark Miller)
|
||
|
||
* SOLR-4345: Solr Admin UI doesn't work in IE 10 (steffkes)
|
||
|
||
* SOLR-4349: Admin UI - Query Interface does not work in IE
|
||
(steffkes)
|
||
|
||
* SOLR-4359: The RecentUpdates#update method should treat a problem reading the
|
||
next record the same as a problem parsing the record - log the exception and
|
||
break. (Mark Miller)
|
||
|
||
* SOLR-4225: Term info page under schema browser shows incorrect count of terms
|
||
(steffkes)
|
||
|
||
* SOLR-3926: Solr should support better way of finding active sorts (Eirik Lygre via
|
||
Erick Erickson)
|
||
|
||
* SOLR-4342: Fix DataImportHandler stats to be a proper Map (hossman)
|
||
|
||
* SOLR-3967: langid.enforceSchema option checks source field instead of target field (janhoy)
|
||
|
||
* SOLR-4380: Replicate after startup option would not replicate until the
|
||
IndexWriter was lazily opened. (Mark Miller, Gregg Donovan)
|
||
|
||
* SOLR-4400: Deadlock can occur in a rare race between committing and
|
||
closing a SolrIndexWriter. (Erick Erickson, Mark Miller)
|
||
|
||
* SOLR-3655: A restarted node can briefly appear live and active before it really
|
||
is in some cases. (Mark Miller)
|
||
|
||
* SOLR-4426: NRTCachingDirectoryFactory does not initialize maxCachedMB and maxMergeSizeMB
|
||
if <directoryFactory> is not present in solrconfig.xml (Jack Krupansky via shalin)
|
||
|
||
* SOLR-4463: Fix SolrCoreState reference counting. (Mark Miller)
|
||
|
||
* SOLR-4459: The Replication 'index move' rather than copy optimization doesn't
|
||
kick in when using NRTCachingDirectory or the rate limiting feature.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4421,SOLR-4165: On CoreContainer shutdown, all SolrCores should publish their
|
||
state as DOWN. (Mark Miller, Markus Jelsma)
|
||
|
||
* SOLR-4467: Ephemeral directory implementations may not recover correctly
|
||
because the code to clear the tlog files on startup is off. (Mark Miller)
|
||
|
||
* SOLR-4413: Fix SolrCore#getIndexDir() to return the current index directory.
|
||
(Gregg Donovan, Mark Miller)
|
||
|
||
* SOLR-4469: A new IndexWriter must be opened on SolrCore reload when the index
|
||
directory has changed and the previous SolrCore's state should not be
|
||
propagated. (Mark Miller, Gregg Donovan)
|
||
|
||
* SOLR-4471: Replication occurs even when a slave is already up to date.
|
||
(Mark Miller, Andre Charton)
|
||
|
||
* SOLR-4484: ReplicationHandler#loadReplicationProperties still uses Files
|
||
rather than the Directory to try and read the replication properties files.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4352: /browse pagination now supports and preserves sort context
|
||
(Eric Spiegelberg, Erik Hatcher)
|
||
|
||
* LUCENE-4796, SOLR-4373: Fix concurrency issue in NamedSPILoader and
|
||
AnalysisSPILoader when doing concurrent core loads in multicore
|
||
Solr configs. (Uwe Schindler, Hossman)
|
||
|
||
* SOLR-4504: Fixed CurrencyField range queries to correctly exclude
|
||
documents w/o values (hossman)
|
||
|
||
* SOLR-4480: A trailing + or - caused the edismax parser to throw
|
||
an exception. (Fiona Tay, Jan Høydahl, yonik)
|
||
|
||
* SOLR-4507: The Cloud tab does not show up in the Admin UI if you
|
||
set zkHost in solr.xml. (Alfonso Presa, Mark Miller)
|
||
|
||
* SOLR-4505: Possible deadlock around SolrCoreState update lock.
|
||
(Erick Erickson, Mark Miller)
|
||
|
||
* SOLR-4511: When a new index is replicated into place, we need
|
||
to update the most recent replicatable index point without
|
||
doing a commit. This is important for repeater use cases, as
|
||
well as when nodes may switch master/slave roles.
|
||
(Mark Miller, Raúl Grande)
|
||
|
||
* SOLR-4515: CurrencyField's OpenExchangeRatesOrgProvider now requires
|
||
a ratesFileLocation init param, since the previous global default
|
||
no longer works (hossman)
|
||
|
||
* SOLR-4518: Improved CurrencyField error messages when attempting to
|
||
use a Currency that is not supported by the current JVM. (hossman)
|
||
|
||
* SOLR-3798: Fix copyField implementation in IndexSchema to handle
|
||
dynamic field references that aren't string-equal to the name of
|
||
the referenced dynamic field. (Steve Rowe)
|
||
|
||
* SOLR-4497: Collection Aliasing. (Mark Miller)
|
||
|
||
Optimizations
----------------------
|
||
|
||
* SOLR-4339: Admin UI - Display Field-Flags on Schema-Browser
|
||
(steffkes)
|
||
|
||
* SOLR-4340: Admin UI - Analysis's Button Spinner goes wild
|
||
(steffkes)
|
||
|
||
* SOLR-4341: Admin UI - Plugins/Stats Page contains loooong
|
||
Values which result in horizontal Scrollbar (steffkes)
|
||
|
||
* SOLR-3915: Color Legend for Cloud UI (steffkes)
|
||
|
||
* SOLR-4306: Utilize indexInfo=false when gathering core names in UI
|
||
(steffkes)
|
||
|
||
* SOLR-4284: Admin UI - make core list scrollable separate from the rest of
|
||
the UI (steffkes)
|
||
|
||
* SOLR-4364: Admin UI - Locale based number formatting (steffkes)
|
||
|
||
* SOLR-4521: Stop using the 'force' option for recovery replication. This
|
||
will keep some less common unnecessary replications from happening.
|
||
(Mark Miller, Simon Scofield)
|
||
|
||
* SOLR-4529: Improve Admin UI Dashboard legibility (Felix Buenemann via
|
||
steffkes)
|
||
|
||
* SOLR-4526: Admin UI depends on optional system info (Felix Buenemann via
|
||
steffkes)
|
||
|
||
Other Changes
----------------------
|
||
|
||
* SOLR-4259: Carrot2 dependency should be declared on the mini version, not the core.
|
||
(Dawid Weiss)
|
||
|
||
* SOLR-4348: Make the lock type configurable by system property by default.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4353: Renamed example jetty context file to reduce confusion (hossman)
|
||
|
||
* SOLR-4384: Make post.jar report timing information (Upayavira via janhoy)
|
||
|
||
* SOLR-4415: Add 'state' to shards (default to 'active') and read/write them to
|
||
ZooKeeper (Anshum Gupta via shalin)
|
||
|
||
* SOLR-4394: Tests and example configs demonstrating SSL with both server
|
||
and client certs (hossman)
|
||
|
||
* SOLR-3060: SurroundQParserPlugin highlighting tests
|
||
(Ahmet Arslan via hossman)
|
||
|
||
* SOLR-2470: Added more tests for VelocityResponseWriter
|
||
|
||
* SOLR-4471: Improve and clean up TestReplicationHandler.
|
||
(Amit Nithian via Mark Miller)
|
||
|
||
* SOLR-3843: Include lucene codecs jar and enable per-field postings and docvalues
|
||
support in the schema.xml (Robert Muir, Steve Rowe)
|
||
|
||
* SOLR-4511: Add new test for 'repeater' replication node. (Mark Miller)
|
||
|
||
* SOLR-4458: Sort directions (asc, desc) are now case insensitive
|
||
(Shawn Heisey via hossman)
|
||
|
||
* SOLR-2996: A bare * without a field specification is treated as *:*
|
||
by the lucene and edismax query parsers.
|
||
(hossman, Jan Høydahl, Alan Woodward, yonik)
|
||
|
||
* SOLR-4416: Upgrade to Tika 1.3. (Markus Jelsma via Mark Miller)
|
||
|
||
* SOLR-4200: Reduce INFO level logging from CachingDirectoryFactory
|
||
(Shawn Heisey via hossman)
|
||
|
||
================== 4.1.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.2
Carrot2 3.6.2
Velocity 1.7 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Upgrading from Solr 4.0.0
----------------------

Custom java parsing plugins need to migrate from throwing the internal
ParseException to throwing SyntaxError.

BaseDistributedSearchTestCase now randomizes the servlet context it uses when
creating Jetty instances. Subclasses that assume a hard coded context of
"/solr" should either be fixed to use the "String context" variable, or should
take advantage of the new BaseDistributedSearchTestCase(String) constructor
to explicitly specify a fixed servlet context path. See SOLR-4136 for details.


Detailed Change List
----------------------

New Features
----------------------
|
||
|
||
* SOLR-2255: Enhanced pivot faceting to use local-params in the same way that
|
||
regular field value faceting can. This means support for excluding a filter
|
||
query, using a different output key, and specifying 'threads' to do
|
||
facet.method=fcs concurrently. PivotFacetHelper now extends SimpleFacet and
|
||
the getFacetImplementation() extension hook was removed. (dsmiley)
|
||
|
||
* SOLR-3897: A highlighter parameter "hl.preserveMulti" to return all of the
|
||
values of a multiValued field in their original order when highlighting.
|
||
(Joel Bernstein via yonik)
|
||
|
||
* SOLR-3929: Support configuring IndexWriter max thread count in solrconfig.
|
||
(phunt via Mark Miller)
|
||
|
||
* SOLR-3906: Add support for AnalyzingSuggester (LUCENE-3842), where the
|
||
underlying analyzed form used for suggestions is separate from the returned
|
||
text. (Robert Muir)
|
||
|
||
* SOLR-3985: ExternalFileField caches can be reloaded on firstSearcher/
|
||
newSearcher events using the ExternalFileFieldReloader (Alan Woodward)
|
||
|
||
* SOLR-3911: Make Directory and DirectoryFactory first class so that the majority
|
||
of Solr's features work with any custom implementations. (Mark Miller)
|
||
Additional Work:
|
||
- SOLR-4032: Files larger than an internal buffer size fail to replicate.
|
||
(Mark Miller, Markus Jelsma)
|
||
- SOLR-4033: Consistently use the solrconfig.xml lockType everywhere.
|
||
(Mark Miller, Markus Jelsma)
|
||
- SOLR-4144: Replication using too much RAM. (yonik, Markus Jelsma)
|
||
- SOLR-4187: NPE on Directory release (Mark Miller, Markus Jelsma)
|
||
|
||
* SOLR-4051: Add <propertyWriter /> element to DIH's data-config.xml file,
|
||
allowing the user to specify the location, filename and Locale for
|
||
the "data-config.properties" file. Alternatively, users can specify their
|
||
own property writer implementation for greater control. This new configuration
|
||
element is optional, and defaults mimic prior behavior. The one exception is
|
||
that the "root" locale is default. Previously it was the machine's default locale.
|
||
(James Dyer)
|
||
|
||
* SOLR-4084: Add FuzzyLookupFactory, which is like AnalyzingSuggester except that
|
||
it can tolerate typos in the input. (Areek Zillur via Robert Muir)
|
||
|
||
* SOLR-4088: New and improved auto host detection strategy for SolrCloud.
|
||
(Raintung Li via Mark Miller)
|
||
|
||
* SOLR-3970: SystemInfoHandler now exposes more details about the
|
||
JRE/VM/Java version in use. (hossman)
|
||
|
||
* SOLR-4101: Add support for storing term offsets in the index via a
|
||
'storeOffsetsWithPositions' flag on field definitions in the schema.
|
||
(Tom Winch, Alan Woodward)
|
||
|
||
* SOLR-4093: Solr QParsers may now be directly invoked in the lucene
|
||
query syntax without the _query_ magic field hack.
|
||
Example: foo AND {!term f=myfield v=$qq}
|
||
(yonik)
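  Illustration for SOLR-4093 (a sketch under assumptions): the example query
  above refers to a separate request parameter through v=$qq, so both q and qq
  have to be sent. The value given to qq here is hypothetical.

```python
from urllib.parse import urlencode, parse_qs

params = {
    # Query string taken from the example in the entry above.
    "q": "foo AND {!term f=myfield v=$qq}",
    # Raw term referenced by $qq; it needs no query-syntax escaping of its own.
    "qq": "Some Exact Value",
}
query_string = urlencode(params)

# Decoding the encoded string gives back the original parameters.
decoded = parse_qs(query_string)
```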
|
||
|
||
* SOLR-4087: Add MAX_DOC_FREQ option to MoreLikeThis.
|
||
(Andrew Janowczyk via Mark Miller)
|
||
|
||
* SOLR-4114: Allow creating more than one shard per instance with the
|
||
Collection API. (Per Steffensen, Mark Miller)
|
||
|
||
* SOLR-3531: Allow configuring maxMergeSizeMB and maxCachedMB when
|
||
using NRTCachingDirectoryFactory. (Andy Laird via Mark Miller)
|
||
|
||
* SOLR-4118: Fix replicationFactor to align with industry usage.
|
||
replicationFactor now means the total number of copies
|
||
of a document stored in the collection (or the total number of
|
||
physical indexes for a single logical slice of the collection).
|
||
For example if replicationFactor=3 then for a given shard there
|
||
will be a total of 3 replicas (one of which will normally be
|
||
designated as the leader.) (yonik)
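  Illustration for SOLR-4118: the arithmetic implied by the definition above,
  written out as a small helper. The function name and the keys of the returned
  dict are hypothetical.

```python
def collection_layout(num_shards, replication_factor):
    """Count the physical indexes implied by replicationFactor as defined above."""
    if num_shards < 1 or replication_factor < 1:
        raise ValueError("numShards and replicationFactor must be at least 1")
    return {
        # Every shard holds replicationFactor copies, the leader included.
        "replicas_per_shard": replication_factor,
        "non_leader_replicas_per_shard": replication_factor - 1,
        "total_cores": num_shards * replication_factor,
    }


layout = collection_layout(num_shards=2, replication_factor=3)
```

  For two shards with replicationFactor=3, this counts six physical indexes in
  total, three per shard.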
|
||
|
||
* SOLR-4124: You should be able to set the update log directory with the
|
||
CoreAdmin API the same way as the data directory. (Mark Miller)
|
||
|
||
* SOLR-4028: When using ZK chroot, it would be nice if Solr would create the
|
||
initial path when it doesn't exist. (Tomás Fernández Löbbe via Mark Miller)
|
||
|
||
* SOLR-3948: Calculate/display deleted documents in admin interface.
|
||
(Shawn Heisey via Mark Miller)
|
||
|
||
* SOLR-4030: Allow rate limiting Directory IO based on the IO context.
|
||
(Mark Miller, Radim Kolar)
|
||
|
||
* SOLR-4166: LBHttpSolrServer ignores ResponseParser passed in constructor.
|
||
(Steve Molloy via Mark Miller)
|
||
|
||
* SOLR-4140: Allow access to the collections API through CloudSolrServer
|
||
without referencing an existing collection. (Per Steffensen via Mark Miller)
|
||
|
||
* SOLR-788: Distributed search support for MLT.
|
||
(Matthew Woytowitz, Mike Anderson, Jamie Johnson, Mark Miller)
|
||
|
||
* SOLR-4120: Collection API: Support for specifying a list of Solr addresses to
|
||
spread a new collection across. (Per Steffensen via Mark Miller)
|
||
|
||
* SOLR-4110: Configurable Content-Type headers for PHPResponseWriters and
|
||
PHPSerializedResponseWriter. (Dominik Siebel via Mark Miller)
|
||
|
||
* SOLR-1028: The ability to specify "transient" and "loadOnStartup" as new properties of
|
||
<core> tags in solr.xml. Can specify "transientCacheSize" in the <cores> tag. Together
|
||
these allow cores to be loaded only when needed and only transientCacheSize transient
|
||
cores will be loaded at a time, the rest aged out on an LRU basis.
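  Illustration for SOLR-1028 (a conceptual sketch, not Solr's implementation):
  a bounded cache that loads a core on first use and ages out the least recently
  used one once more than transientCacheSize are loaded. The class name and the
  string used as a stand-in for a loaded core are hypothetical.

```python
from collections import OrderedDict


class TransientCoreCache:
    """Keep at most `transient_cache_size` cores loaded, evicting by LRU."""

    def __init__(self, transient_cache_size):
        if transient_cache_size < 1:
            raise ValueError("transient_cache_size must be at least 1")
        self.size = transient_cache_size
        self.loaded = OrderedDict()  # core name -> loaded core, oldest first

    def get(self, name):
        if name in self.loaded:
            self.loaded.move_to_end(name)    # mark as most recently used
            return self.loaded[name]
        core = "core:" + name                # stand-in for loading the core
        self.loaded[name] = core
        if len(self.loaded) > self.size:
            self.loaded.popitem(last=False)  # age out the least recently used
        return core


cache = TransientCoreCache(transient_cache_size=2)
for core_name in ["a", "b", "a", "c"]:
    cache.get(core_name)
```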
|
||
|
||
* SOLR-4246: When update.distrib is set to skip update processors before
|
||
the distributed update processor, always include the log update processor
|
||
so forwarded updates will still be logged. (yonik)
|
||
|
||
* SOLR-4230: The new Solr 4 spatial fields now work with the {!geofilt} and
|
||
{!bbox} query parsers. The score local-param works too. (David Smiley)
|
||
|
||
* SOLR-1972: Add extra statistics to RequestHandlers - 5 & 15-minute reqs/sec
|
||
rolling averages; median, 75th, 95th, 99th, 99.9th percentile request times
|
||
(Alan Woodward, Shawn Heisey, Adrien Grand, Uwe Schindler)
|
||
|
||
* SOLR-4271: Add support for PostingsHighlighter. (Robert Muir)
|
||
|
||
* SOLR-4255: The new Solr 4 spatial fields now have a 'filter' boolean local-param
|
||
that can be set to false to not filter. It's useful when there is already a spatial
|
||
filter query but you also need to sort or boost by distance. (David Smiley)
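  Illustration for SOLR-4255 (a sketch under assumptions): building a {!geofilt}
  local-params string that scores by distance without filtering. The field name
  geo, the point and the distance are hypothetical, and the helper only formats
  text.

```python
def geofilt(sfield, lat, lon, d, score=None, filter_docs=True):
    """Format a {!geofilt ...} local-params string."""
    parts = ["!geofilt"]
    if score:
        parts.append("score=%s" % score)
    if not filter_docs:
        parts.append("filter=false")  # keep scoring but skip the filtering step
    parts.append("sfield=%s" % sfield)
    parts.append("pt=%s,%s" % (lat, lon))
    parts.append("d=%s" % d)
    return "{" + " ".join(parts) + "}"


# A separate fq does the filtering; q only contributes the distance score.
params = {
    "fq": geofilt("geo", 45.15, -93.85, 5),
    "q": geofilt("geo", 45.15, -93.85, 5, score="distance", filter_docs=False),
}
```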
|
||
|
||
* SOLR-4265, SOLR-4283: Solr now parses request parameters (in URL or sent with POST
|
||
using content-type application/x-www-form-urlencoded) in its dispatcher code. It no
|
||
longer relies on special configuration settings in Tomcat or other web containers
|
||
to enable UTF-8 encoding, which is mandatory for correct Solr behaviour. Query
|
||
strings passed in via the URL need to be properly-%-escaped, UTF-8 encoded
|
||
bytes, otherwise Solr refuses to handle the request. The maximum length of
|
||
x-www-form-urlencoded POST parameters can now be configured through the
|
||
requestDispatcher/requestParsers/@formdataUploadLimitInKB setting in
|
||
solrconfig.xml (defaults to 2 MiB). Solr now works out of the box with
|
||
e.g. Tomcat, JBoss,... (Uwe Schindler, Dawid Weiss, Alex Rocher)
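  Illustration for SOLR-4265 (a sketch under assumptions): what "properly
  %-escaped, UTF-8 encoded bytes" looks like for a non-ASCII query, using
  Python's standard library on the client side. The field name title is
  hypothetical.

```python
from urllib.parse import quote, urlencode

# Each non-ASCII character is encoded as UTF-8 bytes, then percent-escaped.
escaped = quote("café", safe="")

# urlencode applies the same escaping to a whole parameter set.
query_string = urlencode({"q": "title:café"})

# The stated default of 2 MiB, expressed in the setting's unit (KB).
default_limit_kb = 2 * 1024
```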
|
||
|
||
* SOLR-2201: DIH's "formatDate" function now supports a timezone as an optional
|
||
fourth parameter (James Dyer, Mark Waddle)
|
||
|
||
* SOLR-4302: New parameter 'indexInfo' (defaults to true) in CoreAdmin STATUS
|
||
command can be used to omit index specific information (Shahar Davidson via shalin)
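  Illustration for SOLR-4302 (a sketch under assumptions): a CoreAdmin STATUS
  request that leaves out index details. The host, port and helper name are
  hypothetical.

```python
from urllib.parse import urlencode


def core_status_url(base, core=None, index_info=True):
    """Build a CoreAdmin STATUS URL, optionally omitting index information."""
    params = {"action": "STATUS"}
    if core:
        params["core"] = core
    if not index_info:
        params["indexInfo"] = "false"  # the parameter defaults to true
    return "%s/admin/cores?%s" % (base, urlencode(params))


url = core_status_url("http://localhost:8983/solr", index_info=False)
```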
|
||
|
||
* SOLR-2592: Collection specific document routing. The "compositeId"
  router is the default for collections with hash based routing (i.e. when
  numShards=N is specified on collection creation). Documents with ids sharing
  the same domain (prefix) will be routed to the same shard, allowing for
  efficient querying.

  Example:
    The following two documents will be indexed to the same shard
    since they share the same domain "customerB!".
    <code>
       {"id" : "customerB!doc1" [...] }
       {"id" : "customerB!doc2" [...] }
    </code>
    At query time, one can specify a "shard.keys" parameter that lists what
    shards the query should cover.
     http://.../query?q=my_query&shard.keys=customerB!

  Collections that do not specify numShards at collection creation time
  use custom sharding and default to the "implicit" router. Document updates
  received by a shard will be indexed to that shard, unless a "_shard_" parameter
  or document field names a different shard.
  (Michael Garski, Dan Rosher, yonik)
|
||
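  Illustration for SOLR-2592 (a sketch under assumptions): extracting the domain
  prefix that the compositeId router keys on, and building the matching
  shard.keys parameter. It does not reproduce Solr's hashing. The helper names
  and the id customerA!doc9 are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlencode


def shard_key(doc_id):
    """Return the domain prefix including "!", or None if the id has none."""
    prefix, sep, _ = doc_id.partition("!")
    return prefix + sep if sep else None


def group_by_shard_key(doc_ids):
    """Group ids by domain; ids sharing a domain are routed to the same shard."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        groups[shard_key(doc_id)].append(doc_id)
    return dict(groups)


groups = group_by_shard_key(["customerB!doc1", "customerB!doc2", "customerA!doc9"])

# Restrict a query to the shard(s) holding the "customerB!" domain.
query_string = urlencode({"q": "my_query", "shard.keys": "customerB!"})
```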
|
||
|
||
Optimizations
----------------------
|
||
|
||
* SOLR-3788: Admin Cores UI should redirect to newly created core details
|
||
(steffkes)
|
||
|
||
* SOLR-3895: XML and XSLT UpdateRequestHandler should not try to resolve
|
||
external entities. This improves speed of loading e.g. XSL-transformed
|
||
XHTML documents. (Martin Herfurt, uschindler, hossman)
|
||
|
||
* SOLR-3614: Fix XML parsing in XPathEntityProcessor to correctly expand
|
||
named entities, but ignore external entities. (uschindler, hossman)
|
||
|
||
* SOLR-3734: Improve Schema-Browser Handling for CopyField using
|
||
dynamicField's (steffkes)
|
||
|
||
* SOLR-3941: The "commitOnLeader" part of distributed recovery can use
|
||
openSearcher=false. (Tomás Fernández Löbbe via Mark Miller)
|
||
|
||
* SOLR-4063: Allow CoreContainer to load multiple SolrCores in parallel rather
|
||
than just serially. (Mark Miller)
|
||
|
||
* SOLR-4199: When doing zk retries due to connection loss, rather than just
|
||
retrying for 2 minutes, retry in proportion to the session timeout.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4262: Replication Icon on Dashboard does not reflect Master-/Slave-
|
||
State (steffkes)
|
||
|
||
* SOLR-4264: Missing Error-Screen on UI's Cloud-Page (steffkes)
|
||
|
||
* SOLR-4261: Percentage Infos on Dashboard have a fixed width (steffkes)
|
||
|
||
* SOLR-3851: Creating a new core or deleting an existing core should also update
|
||
the main/left list of cores on the admin UI (steffkes)
|
||
|
||
* SOLR-3840: XML query response display is unreadable in Solr Admin Query UI
|
||
(steffkes)
|
||
|
||
* SOLR-3982: Admin UI: Various Dataimport Improvements (steffkes)
|
||
|
||
* SOLR-4296: Admin UI: Improve Dataimport Auto-Refresh (steffkes)
|
||
|
||
* SOLR-3458: Allow multiple Items to stay open on Plugins-Page (steffkes)
|
||
|
||
Bug Fixes
----------------------
|
||
|
||
* SOLR-4288: Improve logging for FileDataSource (basePath, relative
|
||
resources). (Dawid Weiss)
|
||
|
||
* SOLR-4007: Morfologik dictionaries not available in Solr field type
|
||
due to class loader lookup problems. (Lance Norskog, Dawid Weiss)
|
||
|
||
* SOLR-3560: Handle different types of Exception Messages for Logging UI
|
||
(steffkes)
|
||
|
||
* SOLR-3637: Commit Status at Core-Admin UI is always false (steffkes)
|
||
|
||
* SOLR-3917: Partial State on Schema-Browser UI is not defined for Dynamic
|
||
Fields & Types (steffkes)
|
||
|
||
* SOLR-3939: Consider a sync attempt from leader to replica that fails due
|
||
to 404 a success. (Mark Miller, Joel Bernstein)
|
||
|
||
* SOLR-3940: Rejoining the leader election incorrectly triggers the code path
|
||
for a fresh cluster start rather than fail over. (Mark Miller)
|
||
|
||
* SOLR-3961: Fixed error using LimitTokenCountFilterFactory
|
||
(Jack Krupansky, hossman)
|
||
|
||
* SOLR-3933: Distributed commits are not guaranteed to be ordered within a
|
||
request. (Mark Miller)
|
||
|
||
* SOLR-3939: An empty or just replicated index cannot become the leader of a
|
||
shard after a leader goes down. (Joel Bernstein, yonik, Mark Miller)
|
||
|
||
* SOLR-3971: A collection that is created with numShards=1 turns into a
|
||
numShards=2 collection after starting up a second core and not specifying
|
||
numShards. (Mark Miller)
|
||
|
||
* SOLR-3988: Fixed SolrTestCaseJ4.adoc(SolrInputDocument) to respect
|
||
field and document boosts (hossman)
|
||
|
||
* SOLR-3981: Fixed bug that resulted in document boosts being compounded in
|
||
<copyField/> destination fields. (hossman)
|
||
|
||
* SOLR-3920: Fix server list caching in CloudSolrServer when using more than one
|
||
collection list with the same instance. (Grzegorz Sobczyk, Mark Miller)
|
||
|
||
* SOLR-3938: prepareCommit command omits commitData causing a failure to trigger
|
||
replication to slaves. (yonik)
|
||
|
||
* SOLR-3992: QuerySenderListener doesn't populate document cache.
|
||
(Shotaro Kamio, yonik)
|
||
|
||
* SOLR-3995: Recovery may never finish on SolrCore shutdown if the last reference to
|
||
a SolrCore is closed by the recovery process. (Mark Miller)
|
||
|
||
* SOLR-3998: Atomic update on uniqueKey field itself causes duplicate document.
|
||
(Eric Spencer, yonik)
|
||
|
||
* SOLR-4001: In CachingDirectoryFactory#close, if there are still refs for a
|
||
Directory outstanding, we need to wait for them to be released before closing.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4005: If CoreContainer fails to register a created core, it should close it.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4009: OverseerCollectionProcessor is not resilient to many error conditions
|
||
and can stop running on errors. (Raintung Li, milesli, Mark Miller)
|
||
|
||
* SOLR-4019: Log stack traces for 503/Service Unavailable SolrException if not
|
||
thrown by PingRequestHandler. Do not log exceptions if a user tries to view a
|
||
hidden file using ShowFileRequestHandler. (Tomás Fernández Löbbe via James Dyer)
|
||
|
||
* SOLR-3589: Edismax parser does not honor mm parameter if analyzer splits a token.
|
||
(Tom Burton-West, Robert Muir)
|
||
|
||
* SOLR-4031: Upgrade to Jetty 8.1.7 to fix a bug where on very rare occasions
the content of two concurrent requests gets mixed up. (Per Steffensen, yonik)
|
||
|
||
* SOLR-4060: ReplicationHandler can try and do a snappull and open a new IndexWriter
|
||
after shutdown has already occurred, leaving an IndexWriter that is not closed.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4055: Fix a thread safety issue with the Collections API that could
|
||
cause actions to be targeted at the wrong SolrCores.
|
||
(Raintung Li, Per Steffensen via Mark Miller)
|
||
|
||
* SOLR-3993: If multiple SolrCore's for a shard coexist on a node, on cluster
|
||
restart, leader election would stall until timeout, waiting to see all of
|
||
the replicas come up. (Mark Miller, Alexey Kudinov)
|
||
|
||
* SOLR-2045: Databases that require a commit to be issued before closing the
|
||
connection on a non-read-only database leak connections. Also expanded the
|
||
SqlEntityProcessor test to sometimes use Derby as well as HSQLDB (Derby is
|
||
one db affected by this bug). (Fenlor Sebastia, James Dyer)
|
||
|
||
* SOLR-4064: When there is an unexpected exception while trying to run the new
|
||
leader process, the SolrCore will not correctly rejoin the election.
|
||
(Po Rui via Mark Miller)
|
||
|
||
* SOLR-3989: SolrZkClient constructor dropped exception cause when throwing
|
||
a new RuntimeException. (Colin Bartolome, yonik)
|
||
|
||
* SOLR-4036: field aliases in fl should not cause properties of target field
|
||
to be used. (Martin Koch, yonik)
|
||
|
||
* SOLR-4003: The SolrZKClient clean method should not try and clear zk paths
|
||
that start with /zookeeper, as this can fail and stop the removal of
|
||
further nodes. (Mark Miller)
|
||
|
||
* SOLR-4076: SolrQueryParser should run fuzzy terms through
|
||
MultiTermAwareComponents to ensure that (for example) a fuzzy query of
|
||
foobar~2 is equivalent to FooBar~2 on a field that includes lowercasing.
|
||
(yonik)
|
||
|
||
* SOLR-4081: QueryParsing.toString, used during debugQuery=true, did not
|
||
correctly handle ExtendedQueries such as WrappedQuery
|
||
(used when cache=false), spatial queries, and frange queries.
|
||
(Eirik Lygre, yonik)
|
||
|
||
* SOLR-3959: Ensure the internal comma separator of poly fields is escaped
|
||
for CSVResponseWriter. (Areek Zillur via Robert Muir)
|
||
|
||
* SOLR-4075: A logical shard that has had all of its SolrCores unloaded should
|
||
be removed from the cluster state. (Mark Miller, Gilles Comeau)
|
||
|
||
* SOLR-4034: Check if a collection already exists before trying to create a
|
||
new one. (Po Rui, Mark Miller)
|
||
|
||
* SOLR-4097: Race can cause NPE in logging line on first cluster state update.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4099: Allow the collection api work queue to make forward progress even
|
||
when its watcher is not fired for some reason. (Raintung Li via Mark Miller)
|
||
|
||
* SOLR-3960: Fixed a bug where Distributed Grouping ignored PostFilters
|
||
(Nathan Visagan, hossman)
|
||
|
||
* SOLR-3842: DIH would not populate multivalued fields if the column name
|
||
derives from a resolved variable (James Dyer)
|
||
|
||
* SOLR-4117: Retrieving the size of the index may use the wrong index dir if
|
||
you are replicating.
|
||
(Mark Miller, Markus Jelsma)
|
||
|
||
* SOLR-2890: Fixed a bug that prevented omitNorms and omitTermFreqAndPositions
|
||
options from being respected in some <fieldType/> declarations (hossman)
|
||
|
||
* SOLR-4159: When we are starting a shard from rest, a potential leader should
|
||
not consider its last published state when deciding if it can be the new
|
||
leader. (Mark Miller)
|
||
|
||
* SOLR-4158: When a core is registering in ZooKeeper it may not wait long
|
||
enough to find the leader due to how long the potential leader waits to see
|
||
replicas. (Mark Miller, Alain Rogister)
|
||
|
||
* SOLR-4162: ZkCli usage examples are not correct because the zkhost parameter
|
||
is not present and it is mandatory for all commands.
|
||
(Tomás Fernández Löbbe via Mark Miller)
|
||
|
||
* SOLR-4071: Validate that name is passed to Collections API create, and behave the
|
||
same way as on startup when collection.configName is not explicitly passed.
|
||
(Po Rui, Mark Miller)
|
||
|
||
* SOLR-4127: Added explicit error message if users attempt Atomic document
|
||
updates with either updateLog or DistribUpdateProcessor. (hossman)
|
||
|
||
* SOLR-4136: Fix SolrCloud behavior when using "hostContext" containing "_"
|
||
or "/" characters. This fix also makes SolrCloud more accepting of
|
||
hostContext values with leading/trailing slashes. (hossman)
|
||
|
||
* SOLR-4168: Ensure we are using the absolute latest index dir when getting
|
||
list of files for replication. (Mark Miller)
|
||
|
||
* SOLR-4171: CachingDirectoryFactory should not return any directories after it
|
||
has been closed. (Mark Miller)
|
||
|
||
* SOLR-4102: Fix UI javascript error if canonical hostname can not be resolved
|
||
(steffkes via hossman)
|
||
|
||
* SOLR-4178: ReplicationHandler should abort any current pulls and wait for
|
||
its executor to stop during core close. (Mark Miller)
|
||
|
||
* SOLR-3918: Fixed the 'dist-war-excl-slf4j' ant target to exclude all
|
||
slf4j jars, so that the resulting war is usable as is provided the servlet
|
||
container includes the correct slf4j api and impl jars.
|
||
(Shawn Heisey, hossman)
|
||
|
||
* SOLR-4198: OverseerCollectionProcessor should implement ClosableThread.
|
||
(Mark Miller)
|
||
|
||
* SOLR-4213: Directories that are not shutdown until DirectoryFactory#close
|
||
do not have close listeners called on them. (Mark Miller)
|
||
|
||
* SOLR-4134: Standard (XML) request writer cannot "set" multiple values into
|
||
multivalued field with partial updates. (Luis Cappa Banda, Will Butler, shalin)
|
||
|
||
* SOLR-3972: Fix ShowFileRequestHandler to not log a warning in the
|
||
(expected) situation of a file not found. (hossman)
|
||
|
||
* SOLR-4133: Cannot "set" field to null with partial updates when using the
|
||
standard RequestWriter. (Will Butler, shalin)
|
||
|
||
* SOLR-4223: "maxFormContentSize" in jetty.xml is not picked up by jetty 8
|
||
so set it via solr webapp context file. (shalin)
|
||
|
||
* SOLR-4175: SearchComponent chain can't contain two components of the
|
||
same class and use debugQuery. (Tomás Fernández Löbbe via ehatcher)
|
||
|
||
* SOLR-4244: When coming back from session expiration we should not wait for
|
||
the leader to see us in the down state if we are the node that must become
|
||
the leader. (Mark Miller)
|
||
|
||
* SOLR-4245: When a core is registering with ZooKeeper, the timeout to find the
|
||
leader in the cluster state is 30 seconds rather than leaderVoteWait + extra
|
||
time. (Mark Miller)
|
||
|
||
* SOLR-4238: Fix jetty example requestLog config (jm via hossman)
|
||
|
||
* SOLR-4251: Fix SynonymFilterFactory when an optional tokenizerFactory is supplied.
|
||
(Chris Bleakley via rmuir)
|
||
|
||
* SOLR-4253: Misleading resource loading warning from Carrot2 clustering
|
||
component fixed (Stanisław Osiński)
|
||
|
||
* SOLR-4257: PeerSync updates and Log Replay updates should not wait for
|
||
a ZooKeeper connection in order to proceed. (yonik)
|
||
|
||
* SOLR-4045: SOLR admin page returns HTTP 404 on core names containing
|
||
a '.' (dot) (steffkes)
|
||
|
||
* SOLR-4176: analysis ui: javascript not properly handling URL decoding
|
||
of input (steffkes)
|
||
|
||
* SOLR-4079: Long core names break web gui appearance and functionality
|
||
(steffkes)
|
||
|
||
* SOLR-4263: Incorrect Link from Schema-Browser to Query From for Top-Terms
|
||
(steffkes)
|
||
|
||
* SOLR-3829: Admin UI Logging events broken if schema.xml defines a catch-all
|
||
dynamicField with type ignored (steffkes)
|
||
|
||
* SOLR-4275: Fix TrieTokenizer to no longer throw StringIndexOutOfBoundsException
|
||
in admin UI / AnalysisRequestHandler when you enter no number to tokenize.
|
||
(Uwe Schindler)
|
||
|
||
* SOLR-4279: Wrong exception message if _version_ field is multivalued (shalin)
|
||
|
||
* SOLR-4170: The 'backup' ReplicationHandler command can sometimes use a stale
|
||
index directory rather than the current one. (Mark Miller, Marcin Rzewucki)
|
||
|
||
* SOLR-3876: Solr Admin UI is completely dysfunctional on IE 9 (steffkes)
|
||
|
||
* SOLR-4112: Fixed DataImportHandler ZKAwarePropertiesWriter implementation so
|
||
import works fine with SolrCloud clusters (Deniz Durmus, James Dyer,
|
||
Erick Erickson, shalin)
|
||
|
||
* SOLR-4291: Harden the Overseer work queue thread loop. (Mark Miller)
|
||
|
||
* SOLR-3820: Solr Admin Query form is missing some edismax request parameters
|
||
(steffkes)
|
||
|
||
* SOLR-4217: post.jar no longer ignores -Dparams when -Durl is used.
|
||
(Alexandre Rafalovitch, ehatcher)
|
||
|
||
* SOLR-4303: On replication, if the generation of the master is lower than the
|
||
slave we need to force a full copy of the index. (Mark Miller, Gregg Donovan)
|
||
|
||
* SOLR-4266: HttpSolrServer does not release connection properly on exception
|
||
when no response parser is used. (Steve Molloy via Mark Miller)
|
||
|
||
* SOLR-2298: Updated JavaDoc for SolrDocument.addField and SolrInputDocument.addField
|
||
to have more information on name and value parameters. (Siva Natarajan)
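
As a rough illustration of SOLR-4134 and SOLR-4133 above, a partial (atomic)
update sent in the standard XML format might look like the sketch below. The
document id and the field names "tags" and "summary" are hypothetical, and the
update="set" and null="true" attributes are the atomic update syntax as it is
commonly documented for Solr 4.x, so verify them against your release.

```xml
<add>
  <doc>
    <field name="id">doc-1</field>
    <!-- SOLR-4134: "set" several values into a multivalued field -->
    <field name="tags" update="set">alpha</field>
    <field name="tags" update="set">beta</field>
    <!-- SOLR-4133: "set" a field to null, which removes its current value -->
    <field name="summary" update="set" null="true"/>
  </doc>
</add>
```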

Other Changes
----------------------

* SOLR-4106: Javac/ivy path warnings with morfologik fixed by
  upgrading to Morfologik 1.5.5 (Robert Muir, Dawid Weiss)

* SOLR-3899: SolrCore should not log at warning level when the index directory
  changes - it's an info event. (Tobias Bergman, Mark Miller)

* SOLR-3861: Refactor SolrCoreState so that it's managed by SolrCore.
  (Mark Miller, hossman)

* SOLR-3966: Eliminate superfluous warning from LanguageIdentifierUpdateProcessor
  (Markus Jelsma via hossman)

* SOLR-3932: SolrCmdDistributorTest either takes 3 seconds or 3 minutes.
  (yonik, Mark Miller)

* SOLR-3856: New tests for SqlEntityProcessor/CachedSqlEntityProcessor
  (James Dyer)

* SOLR-4067: ZkStateReader#getLeaderProps should not return props for a leader
  that it does not think is live. (Mark Miller)

* SOLR-4086: DIH refactor of VariableResolver and Evaluator. VariableResolver
  and each built-in Evaluator are separate concrete classes. DateFormatEvaluator
  now defaults to the ROOT Locale. However, users may specify a different
  Locale using an optional new third parameter. (James Dyer)

* SOLR-3602: Update ZooKeeper to 3.4.5 (Mark Miller)

* SOLR-4095: DIH NumberFormatTransformer & DateFormatTransformer default to the
  ROOT Locale if none is specified. These previously used the machine's default.
  (James Dyer)

* SOLR-4096: DIH FileDataSource & FieldReaderDataSource default to UTF-8 encoding
  if none is specified. These previously used the machine's default.
  (James Dyer)

* SOLR-1916: DIH to not use Lucene-forbidden Java APIs
  (default encoding, locale, etc.) (James Dyer, Robert Muir)

* SOLR-4111: SpellCheckCollatorTest#testContextSensitiveCollate to test against
  both DirectSolrSpellChecker & IndexBasedSpellChecker
  (Tomás Fernández Löbbe via James Dyer)

* SOLR-2141: Better test coverage for Evaluators (James Dyer)

* SOLR-4119: Update Guava to 13.0.1 (Mark Miller)

* SOLR-4074: Raise default ramBufferSizeMB to 100 from 32.
  (yonik, Mark Miller)

* SOLR-4062: The update log location in solrconfig.xml should default to
  ${solr.ulog.dir} rather than ${solr.data.dir:} (Mark Miller)

* SOLR-4155: Upgrade Jetty to 8.1.8. (Robert Muir)

* SOLR-2986: Add MoreLikeThis to warning about features that require uniqueKey.
  Also, change the warning to warn log level. (Shawn Heisey via Mark Miller)

* SOLR-4163: README improvements (Shawn Heisey via hossman)

* SOLR-4248: "ant eclipse" should declare .svn directories as derived.
  (Shawn Heisey via Mark Miller)

* SOLR-3279: Upgrade Carrot2 to 3.6.2 (Stanisław Osiński)

* SOLR-4254: Harden the 'leader requests replica to recover' code path.
  (Mark Miller, yonik)

* SOLR-4226: Extract fl parsing code out of ReturnFields constructor.
  (Ryan Ernst via Robert Muir)

* SOLR-4208: ExtendedDismaxQParserPlugin has been refactored to make
  subclassing easier. (Tomás Fernández Löbbe, hossman)

* SOLR-3735: Relocate the example mime-to-extension mapping, and
  upgrade Velocity Engine to 1.7 (ehatcher)

* SOLR-4287: Removed "apache-" prefix from Solr distribution and artifact
  filenames. (Ryan Ernst, Robert Muir, Steve Rowe)

* SOLR-4016: Deduplication does not work with atomic/partial updates so
  disallow atomic update requests which change signature generating fields.
  (Joel Nothman, yonik, shalin)

* SOLR-4308: Remove the problematic and now unnecessary log4j-over-slf4j.
(Mark Miller)
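
The two configuration defaults touched by SOLR-4074 and SOLR-4062 above could
be written out explicitly in solrconfig.xml roughly as follows. This is a
sketch only: the surrounding element layout follows the stock example
configuration, and the empty fallback after the colon in ${solr.ulog.dir:} is
an assumption rather than something stated in the entries.

```xml
<config>
  <indexConfig>
    <!-- SOLR-4074: the default is now 100 (previously 32) -->
    <ramBufferSizeMB>100</ramBufferSizeMB>
  </indexConfig>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog>
      <!-- SOLR-4062: location taken from solr.ulog.dir, not solr.data.dir -->
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
  </updateHandler>
</config>
```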

================== 4.0.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.2
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.3.6

Upgrading from Solr 4.0.0-BETA
----------------------

In order to better support distributed search mode, the TermVectorComponent's
response format has been changed so that if the schema defines a
uniqueKeyField, then that field value is used as the "key" for each document in
its response section, instead of the internal lucene doc id. Users w/o a
uniqueKeyField will continue to see the same response format. See SOLR-3229
for more details.

If you are using SolrCloud's distributed update request capabilities and a non
string type id field, you must re-index.

Upgrading from Solr 4.0.0-ALPHA
----------------------

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

In addition, please review the notes above about upgrading from 4.0.0-BETA
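
For the uniqueKeyField note above, a schema.xml declaration that satisfies the
stricter check might look like this. The field name "id" and the type
"string" are assumptions chosen for illustration; only the
multiValued="false" attribute comes from the note itself.

```xml
<!-- schema.xml fragment: the uniqueKey field is explicitly single-valued -->
<field name="id" type="string" indexed="true" stored="true"
       required="true" multiValued="false"/>

<uniqueKey>id</uniqueKey>
```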

Upgrading from Solr 3.6
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
  previous versions of Solr will no longer be able to read your indices.
  In a master/slave configuration, all searchers/slaves should be upgraded
  before the master. If the master were to be updated first, the older
  searchers would not be able to read the new index format.

* Setting abortOnConfigurationError=false is no longer supported
  (since it has never worked properly). Solr will now warn you if
  you attempt to set this configuration option at all. (see SOLR-1846)

* The default logic for the 'mm' param of the 'dismax' QParser has
  been changed. If no 'mm' param is specified (either in the query,
  or as a default in solrconfig.xml) then the effective value of the
  'q.op' param (either in the query or as a default in solrconfig.xml
  or from the 'defaultOperator' option in schema.xml) is used to
  influence the behavior. If q.op is effectively "AND" then mm=100%.
  If q.op is effectively "OR" then mm=0%. Users who wish to force the
  legacy behavior should set a default value for the 'mm' param in
  their solrconfig.xml file.

* The VelocityResponseWriter is no longer built into the core. Its JAR and
  dependencies now need to be added (via <lib> or solr/home lib inclusion),
  and it needs to be registered in solrconfig.xml like this:
    <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>

* The update request parameter to choose Update Request Processor Chain is
  renamed from "update.processor" to "update.chain". The old parameter was
  deprecated, though still functional, since Solr 3.2, and is now removed
  entirely.

* The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
  and replaced with the <indexConfig> section. There are also better defaults.
  When migrating, if you don't know what your old settings mean, simply delete
  both <indexDefaults> and <mainIndex> sections. If you have customizations,
  put them in the <indexConfig> section, using the same syntax as before.

* Two of the SolrServer subclasses in SolrJ were renamed/replaced.
  CommonsHttpSolrServer is now HttpSolrServer, and
  StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.

* The PingRequestHandler no longer looks for a <healthcheck/> option in the
  (legacy) <admin> section of solrconfig.xml. Users who wish to take
  advantage of this feature should configure a "healthcheckFile" init param
  directly on the PingRequestHandler. As part of this change, relative file
  paths have been fixed to be resolved against the data dir. See the example
  solrconfig.xml and SOLR-1258 for more details.

* Due to low level changes to support SolrCloud, the uniqueKey field can no
  longer be populated via <copyField/> or <field default=...> in the
  schema.xml. Users wishing to have Solr automatically generate a uniqueKey
  value when adding documents should instead use an instance of
  solr.UUIDUpdateProcessorFactory in their update processor chain. See
SOLR-2796 for more details.
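
Several of the solrconfig.xml changes listed above can be pictured together in
one fragment. Treat it as a minimal sketch under assumptions, not a drop-in
configuration: the handler names, the chain name "uuid", the field name "id",
the file name "server-enabled.txt", the ping query and the mergeFactor and
'mm' values are all placeholders to replace with your own settings.

```xml
<config>
  <!-- Replaces the old <indexDefaults> and <mainIndex> sections -->
  <indexConfig>
    <mergeFactor>10</mergeFactor>
  </indexConfig>

  <!-- Pin 'mm' for dismax instead of letting it be derived from q.op -->
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="mm">100%</str>
    </lst>
  </requestHandler>

  <!-- Choose the processor chain with update.chain (was update.processor) -->
  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">uuid</str>
    </lst>
  </requestHandler>

  <!-- healthcheckFile is now an init param of the PingRequestHandler;
       a relative path is resolved against the data dir -->
  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="q">solrpingquery</str>
    </lst>
    <str name="healthcheckFile">server-enabled.txt</str>
  </requestHandler>

  <!-- Generate the uniqueKey value in the update chain rather than via
       <copyField/> or a field default -->
  <updateRequestProcessorChain name="uuid">
    <processor class="solr.UUIDUpdateProcessorFactory">
      <str name="fieldName">id</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>
</config>
```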

In addition, please review the notes above about upgrading from 4.0.0-BETA and 4.0.0-ALPHA

Detailed Change List
----------------------

New Features
----------------------

* SOLR-3670: New CountFieldValuesUpdateProcessorFactory makes it easy to index
  the number of values in another field for later use at query time. (hossman)

* SOLR-2768: new "mod(x,y)" function for computing the modulus of two value
  sources. (hossman)

* SOLR-3238: Numerous small improvements to the Admin UI (steffkes)

* SOLR-3597: seems like a lot of wasted whitespace at the top of the admin screens
  (steffkes)

* SOLR-3304: Added Solr adapters for Lucene 4's new spatial module. With
  SpatialRecursivePrefixTreeFieldType ("location_rpt" in example schema), it is
  possible to index a variable number of points per document (and sort on them),
  index not just points but any Spatial4j supported shape such as polygons, and
  to query on these shapes too. Polygons require adding JTS to the classpath.
  (David Smiley)

* SOLR-3825: Added optional capability to log what ids are in a response
  (Scott Stults via gsingers)

* SOLR-3821: Added 'df' to the UI Query form (steffkes)

* SOLR-3822: Added hover titles to the edismax params on the UI Query form
(steffkes)
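
One plausible way to use the CountFieldValuesUpdateProcessorFactory from
SOLR-3670 above is to clone a field, count the cloned values, and fall back to
zero when the source field is absent. The chain below is a sketch along those
lines; the field names "category" and "category_count" are hypothetical, and
the chain still needs solr.RunUpdateProcessorFactory to actually index
documents.

```xml
<updateRequestProcessorChain name="count-values">
  <!-- Copy the source values into the field that will hold the count -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">category</str>
    <str name="dest">category_count</str>
  </processor>
  <!-- Replace the copied values with the number of values -->
  <processor class="solr.CountFieldValuesUpdateProcessorFactory">
    <str name="fieldName">category_count</str>
  </processor>
  <!-- Documents without the source field get an explicit 0 -->
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">category_count</str>
    <int name="value">0</int>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```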

Optimizations
----------------------

* SOLR-3715: improve concurrency of the transaction log by removing
  synchronization around log record serialization. (yonik)

* SOLR-3807: Currently during recovery we pause for a number of seconds after
  waiting for the leader to see a recovering state so that any previous updates
  will have finished before our commit on the leader - we don't need this wait
  for peersync. (Mark Miller)

* SOLR-3837: When a leader is elected and asks replicas to sync back to him and
  that fails, we should ask those nodes to recover asynchronously rather than
  synchronously. (Mark Miller)

* SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer
  on each request. (Mark Miller)


Bug Fixes
----------------------

* SOLR-3685: Solr Cloud sometimes skipped peersync attempt and replicated instead due
  to tlog flags not being cleared when no updates were buffered during a previous
  replication. (Markus Jelsma, Mark Miller, yonik)

* SOLR-3229: Fixed TermVectorComponent to work with distributed search
  (Hang Xie, hossman)

* SOLR-3725: Fixed package-local-src-tgz target to not bring in unnecessary jars
  and binary contents. (Michael Dodsworth via rmuir)

* SOLR-3649: Fixed bug in JavabinLoader that caused deleteById(List<String> ids)
  to not work in SolrJ (siren)

* SOLR-3730: Rollback is not implemented quite right and can cause corner case fails in
  SolrCloud tests. (rmuir, Mark Miller)

* SOLR-2981: Fixed StatsComponent to no longer return duplicated information
  when requesting multiple stats.facet fields.
  (Roman Kliewer via hossman)

* SOLR-3743: Fixed issues with atomic updates and optimistic concurrency in
  conjunction with stored copyField targets by making real-time get never
  return copyField targets. (yonik)

* SOLR-3746: Proper error reporting if updateLog is configured w/o necessary
  "_version_" field in schema.xml (hossman)

* SOLR-3745: Proper error reporting if SolrCloud mode is used w/o
  necessary "_version_" field in schema.xml (hossman)

* SOLR-3770: Overseer may lose updates to cluster state (siren)

* SOLR-3721: Fix bug that could theoretically allow multiple recoveries to run
  briefly at the same time if the recovery thread join call was interrupted.
  (Per Steffensen, Mark Miller)

* SOLR-3782: A leader going down while updates are coming in can cause shard
  inconsistency. (Mark Miller)

* SOLR-3611: We do not show ZooKeeper data in the UI for a node that has children.
  (Mark Miller)

* SOLR-3789: Fix bug in SnapPuller that caused "internal" compression to fail.
  (siren)

* SOLR-3790: ConcurrentModificationException could be thrown when using hl.fl=*.
  Fixed in r1231606. (yonik, koji)

* SOLR-3668: DataImport : Specifying Custom Parameters (steffkes)

* SOLR-3793: UnInvertedField faceting cached big terms in the filter
  cache that ignored deletions, leading to duplicate documents in search
  later when a filter of the same term was specified.
  (Günter Hipler, hossman, yonik)

* SOLR-3679: Core Admin UI gives no feedback if "Add Core" fails (steffkes, hossman)

* SOLR-3795: Fixed LukeRequestHandler response to correctly return field name
  strings in copyDests and copySources arrays (hossman)

* SOLR-3699: Fixed some Directory leaks when there were errors during SolrCore
  or SolrIndexWriter initialization. (hossman)

* SOLR-3518: Include final 'hits' in log information when aggregating a
  distributed request (Markus Jelsma via hossman)

* SOLR-3628: SolrInputField and SolrInputDocument are now consistently backed
  by Collections passed in to setValue/setField, and defensively copy values
  from Collections passed to addValue/addField
  (Tom Switzer via hossman)

* SOLR-3595: CurrencyField now generates an appropriate error on schema init
  if it is configured as multiValued - this has never been properly supported,
  but previously failed silently in odd ways. (hossman)

* SOLR-3823: Fix 'bq' parsing in edismax. Please note that this required
  reverting the negative boost support added by SOLR-3278 (hossman)

* SOLR-3827: Fix shareSchema=true in solr.xml
  (Tomás Fernández Löbbe via hossman)

* SOLR-3809: Fixed config file replication when subdirectories are used
  (Emmanuel Espina via hossman)

* SOLR-3828: Fixed QueryElevationComponent so that using 'markExcludes' does
  not modify the result set or ranking of 'excluded' documents relative to
  not using elevation at all. (Alexey Serba via hossman)

* SOLR-3569: Fixed debug output on distributed requests when there are no
  results found. (David Bowen via hossman)

* SOLR-3811: Query Form using wrong values for dismax, edismax (steffkes)

* SOLR-3779: DataImportHandler's LineEntityProcessor when used in conjunction
  with FileListEntityProcessor would only process the first file.
  (Ahmet Arslan via James Dyer)

* SOLR-3791: CachedSqlEntityProcessor would throw a NullPointerException when
  a query returns a row with a NULL key. (Steffen Moelter via James Dyer)

* SOLR-3833: When an election is started because a leader went down, the new
  leader candidate should decline if the last state they published was not
  active. (yonik, Mark Miller)

* SOLR-3836: When doing peer sync, we should only count sync attempts that
  cannot reach the given host as success when the candidate leader is
  syncing with the replicas - not when replicas are syncing to the leader.
  (Mark Miller)

* SOLR-3835: In our leader election algorithm, if on connection loss we found
  we did not create our election node, we should retry, not throw an exception.
  (Mark Miller)

* SOLR-3834: A new leader on cluster startup should also run the leader sync
  process in case there was a bad cluster shutdown. (Mark Miller)

* SOLR-3772: On cluster startup, we should wait until we see all registered
  replicas before running the leader process - or if they all do not come up,
  N amount of time. (Mark Miller)

* SOLR-3756: If we are elected the leader of a shard, but we fail to publish
  this for any reason, we should clean up and re-trigger a leader election.
  (Mark Miller)

* SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to
  shard inconsistency. (Mark Miller)

* SOLR-3813: When a new leader syncs, we need to ask all shards to sync back,
  not just those that are active. (Mark Miller)

* SOLR-3641: CoreContainer is not persisting roles core attribute.
  (hossman, Mark Miller)

* SOLR-3527: SolrCmdDistributor drops some of the important commit attributes
  (maxOptimizeSegments, softCommit, expungeDeletes) when sending a commit to
  replicas. (Andy Laird, Tomás Fernández Löbbe, Mark Miller)

* SOLR-3844: SolrCore reload can fail because it tries to remove the index
  write lock while already holding it. (Mark Miller)

* SOLR-3831: Atomic updates do not distribute correctly to other nodes.
  (Jim Musil, Mark Miller)

* SOLR-3465: Replication causes two searcher warmups.
  (Michael Garski, Mark Miller)

* SOLR-3645: /terms should default to distrib=false. (Nick Cotton, Mark Miller)

* SOLR-3759: Various fixes to the example-DIH configs (Ahmet Arslan, hossman)

* SOLR-3777: Dataimport-UI does not send unchecked checkboxes (Glenn MacStravic
  via steffkes)

* SOLR-3850: DataImportHandler "cacheKey" parameter was incorrectly renamed "cachePk"
  (James Dyer)

* SOLR-3087: Fixed DOMUtil so that code doing attribute validation will
  automatically ignore nodes in the reserved "xml" prefix - in particular this
  fixes some bugs related to xinclude and fieldTypes.
  (Amit Nithian, hossman)

* SOLR-3783: Fixed Pivot Faceting to work with facet.missing=true (hossman)

* SOLR-3869: A PeerSync attempt to its replicas by a candidate leader should
  not fail on o.a.http.conn.ConnectTimeoutException. (Mark Miller)

* SOLR-3875: Fixed index boosts on multi-valued fields when docBoost is used
  (hossman)

* SOLR-3878: Exception when using open-ended range query with CurrencyField (janhoy)

* SOLR-3891: CacheValue in CachingDirectoryFactory cannot be used outside of
  solr.core package. (phunt via Mark Miller)

* SOLR-3892: Inconsistent locking when accessing cache in CachingDirectoryFactory
  from RAMDirectoryFactory and MockDirectoryFactory. (phunt via Mark Miller)

* SOLR-3883: Distributed indexing forwards non-applicable request params.
  (Dan Sutton, Per Steffensen, yonik, Mark Miller)

* SOLR-3903: Fixed MissingFormatArgumentException in ConcurrentUpdateSolrServer
  (hossman)

* SOLR-3916: Fixed whitespace bug in parsing the fl param (hossman)

Other Changes
----------------------

* SOLR-3690: Fixed binary release packages to include dependencies needed for
  the solr-test-framework (hossman)

* SOLR-2857: The /update/json and /update/csv URLs were restored to aid
  in the migration of existing clients. (yonik)

* SOLR-3691: SimplePostTool: Mode for crawling/posting web pages
  See http://wiki.apache.org/solr/ExtractingRequestHandler for examples (janhoy)

* SOLR-3707: Upgrade Solr to Tika 1.2 (janhoy)

* SOLR-2747: Updated changes2html.pl to handle Solr's CHANGES.txt; added
  target 'changes-to-html' to solr/build.xml.
  (Steve Rowe, Robert Muir)

* SOLR-3752: When a leader goes down, have the Overseer clear the leader state
  in cluster.json (Mark Miller)

* SOLR-3751: Add defensive checks for SolrCloud updates and requests that ensure
  the local state matches what we can tell the request expected. (Mark Miller)

* SOLR-3773: Hash based on the external String id rather than the indexed
  representation for distributed updates. (Michael Garski, yonik, Mark Miller)

* SOLR-3780: Maven build: Make solrj tests run separately from solr-core.
  (Steve Rowe)

* SOLR-3772: Optionally, on cluster startup, we can wait until we see all registered
  replicas before running the leader process - or if they all do not come up,
  N amount of time. (Jan Høydahl, Per Steffensen, Mark Miller)

* SOLR-3750: Optionally, on session expiration, we can explicitly wait some time before
  running the leader sync process so that we are sure every node participates.
  (Per Steffensen, Mark Miller)

* SOLR-3824: Velocity: Error messages from search not displayed (janhoy)

* SOLR-3826: Test framework improvements for specifying coreName on initCore
  (Amit Nithian, hossman)

* SOLR-3749: Allow default UpdateLog syncLevel to be configured by
  solrconfig.xml (Raintung Li, Mark Miller)

* SOLR-3845: Rename numReplicas to replicationFactor in Collections API.
  (yonik, Mark Miller)

* SOLR-3815: SolrCloud - Add properties such as "range" to shards, which changes
  the clusterstate.json and puts the shard replicas under "replicas". (yonik)

* SOLR-3871: SyncStrategy should use an executor for the threads it creates to
  request recoveries. (Mark Miller)

* SOLR-3870: SyncStrategy should have a close so it can abort earlier on
shutdown. (Mark Miller)
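
For SOLR-2857 above, the restored URLs correspond to request handler
registrations in solrconfig.xml. The fragment below is a sketch of what such
registrations may look like; the handler class names and the
stream.contentType defaults are recalled from the example configuration of
that era and should be treated as assumptions to check against the
solrconfig.xml shipped with your release.

```xml
<config>
  <!-- Kept for clients that still post to /update/json -->
  <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">
    <lst name="defaults">
      <str name="stream.contentType">application/json</str>
    </lst>
  </requestHandler>

  <!-- Kept for clients that still post to /update/csv -->
  <requestHandler name="/update/csv" class="solr.CSVRequestHandler">
    <lst name="defaults">
      <str name="stream.contentType">application/csv</str>
    </lst>
  </requestHandler>
</config>
```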


================== 4.0.0-BETA ===================


Versions of Major Components
---------------------
Apache Tika 1.1
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.3.6

Upgrading from Solr 4.0.0-ALPHA
----------------------

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

Detailed Change List
----------------------

New Features
----------------------

* LUCENE-4201: Added JapaneseIterationMarkCharFilterFactory to normalize Japanese
  iteration marks. (Robert Muir, Christian Moen)

* SOLR-1856: In Solr Cell, literals should override Tika-parsed values.
  Patch adds a param "literalsOverride" which defaults to true, but can be set
  to "false" to let Tika-parsed values be appended to literal values (Chris Harris, janhoy)

* SOLR-3488: Added a Collection management API for SolrCloud.
  (Tommaso Teofili, Sami Siren, yonik, Mark Miller)

* SOLR-3559: Full deleteByQuery support with SolrCloud distributed indexing.
  All replicas of a shard will be consistent, even if updates arrive in a
  different order on different replicas. (yonik)

* SOLR-1929: Index encrypted documents with ExtractingUpdateRequestHandler.
  By supplying resource.password=<mypw> or specifying an external file with regular
  expressions matching file names, Solr will decrypt and index PDFs and DOCX formats.
  (janhoy, Yiannis Pericleous)

* SOLR-3562: Add options to remove instance dir or data dir on core unload.
  (Mark Miller, Per Steffensen)

* SOLR-2702: The default directory factory was changed to NRTCachingDirectoryFactory
  which wraps the StandardDirectoryFactory and caches small files for improved
  Near Real-time (NRT) performance. (Mark Miller, yonik)

* SOLR-2616: Include a sample java util logging configuration file.
  (David Smiley, Mark Miller)

* SOLR-3460: Add cloud-scripts directory and a zkcli.sh|bat tool for easy scripting
  and interaction with ZooKeeper. (Mark Miller)

* SOLR-1725: StatelessScriptUpdateProcessorFactory allows users to implement
  the full ScriptUpdateProcessor API using any scripting language with a
  javax.script.ScriptEngineFactory
  (Uri Boness, ehatcher, Simon Rosenthal, hossman)

* SOLR-139: Change to updateable documents to create the document if it doesn't
  already exist. To assert that the document must exist, use the optimistic
  concurrency feature by specifying a _version_ of 1. (yonik)

* LUCENE-2510, LUCENE-4044: Migrated Solr's Tokenizer-, TokenFilter-, and
  CharFilterFactories to the lucene-analysis module. To add new analysis
  modules to Solr (like ICU, SmartChinese, Morfologik,...), just drop in
  the JAR files from Lucene's binary distribution into your Solr instance's
  lib folder. The factories are automatically made available with SPI.
  (Chris Male, Robert Muir, Uwe Schindler)

* SOLR-3634, SOLR-3635: CoreContainer and CoreAdminHandler will now remember
  and report back information about failures to initialize SolrCores. These
  failures will be accessible from the web UI and CoreAdminHandler STATUS
  command until they are "reset" by creating/renaming a SolrCore with the
  same name. (hossman, steffkes)

* SOLR-1280: Added commented-out example of the new script update processor
  to the example configuration. See http://wiki.apache.org/solr/ScriptUpdateProcessor (ehatcher)

* SOLR-3672: SimplePostTool: Improvements for posting files
Support for auto mode, recursive and wildcards (janhoy)
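
Two of the features above, SOLR-2702 and SOLR-1725, show up directly in
solrconfig.xml. The fragment below sketches how they might be configured; the
chain name "script" and the script file name "update-script.js" are
placeholders, and the solr.directoryFactory system property override follows
the pattern used in the stock example configuration.

```xml
<config>
  <!-- SOLR-2702: NRTCachingDirectoryFactory is the default directory factory -->
  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

  <!-- SOLR-1725: run a script against every update command in this chain -->
  <updateRequestProcessorChain name="script">
    <processor class="solr.StatelessScriptUpdateProcessorFactory">
      <str name="script">update-script.js</str>
    </processor>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>
</config>
```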
|
||
|
||
Optimizations
|
||
----------------------
|
||
|
||
* SOLR-3708: Add hashCode to ClusterState so that structures built based on the
|
||
ClusterState can be easily cached. (Mark Miller)
|
||
|
||
* SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer on each
|
||
request. (Mark Miller, yonik)
|
||
|
||
* SOLR-3710: Change CloudSolrServer so that update requests are only sent to leaders by
|
||
default. (Mark Miller)
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
|
||
* SOLR-3582: Our ZooKeeper watchers respond to session events as if they are change events,
|
||
creating undesirable side effects. (Trym R. Møller, Mark Miller)
|
||
|
||
* SOLR-3467: ExtendedDismax escaping is missing several reserved characters
|
||
(Michael Dodsworth via janhoy)
|
||
|
||
* SOLR-3587: After reloading a SolrCore, the original Analyzer is still used rather than a new
|
||
one. (Alexey Serba, yonik, rmuir, Mark Miller)
|
||
|
||
* LUCENE-4185: Fix a bug where CharFilters were wrongly being applied twice. (Michael Froh, rmuir)
|
||
|
||
* SOLR-3610: After reloading a core, indexing would fail on any newly added fields to the schema. (Brent Mills, rmuir)
|
||
|
||
* SOLR-3377: edismax fails to correctly parse a fielded query wrapped by parens.
|
||
This regression was introduced in 3.6. (Bernd Fehling, Jan Høydahl, yonik)
|
||
|
||
* SOLR-3621: Fix rare concurrency issue when opening a new IndexWriter for replication or rollback.
|
||
(Mark Miller)
|
||
|
||
* SOLR-1781: Replication index directories not always cleaned up.
|
||
(Markus Jelsma, Terje Sten Bjerkseth, Mark Miller)
|
||
|
||
* SOLR-3639: Update ZooKeeper to 3.3.6 for a variety of bug fixes. (Mark Miller)
|
||
|
||
* SOLR-3629: Typo in solr.xml persistence when overriding the solrconfig.xml
|
||
file name using the "config" attribute prevented the override file from being
|
||
used. (Ryan Zezeski, hossman)
|
||
|
||
* SOLR-3642: Correct broken check for multivalued fields in stats.facet
|
||
(Yandong Yao, hossman)
|
||
|
||
* SOLR-3660: Velocity: Link to admin page broken (janhoy)
|
||
|
||
* SOLR-3658: Adding thousands of docs with one UpdateProcessorChain instance can briefly create
|
||
spikes of threads in the thousands. (yonik, Mark Miller)
|
||
|
||
* SOLR-3656: A core reload now always uses the same dataDir. (Mark Miller, yonik)
|
||
|
||
* SOLR-3662: Core reload bugs: a reload always obtained a non-NRT searcher, which
|
||
could go back in time with respect to the previous core's NRT searcher. Versioning
|
||
did not work correctly across a core reload, and update handler synchronization
|
||
was changed to synchronize on core state since more than on update handler
|
||
can coexist for a single index during a reload. (yonik)
|
||
|
||
* SOLR-3663: There are a couple of bugs in the sync process when a leader goes down and a
|
||
new leader is elected. (Mark Miller)
|
||
|
||
* SOLR-3623: Fixed inconsistent treatment of third-party dependencies for
|
||
solr contribs analysis-extras & uima (hossman)
|
||
|
||
* SOLR-3652: Fixed range faceting to error instead of looping infinitely
|
||
when 'gap' is zero -- or effectively zero due to floating point arithmetic
|
||
underflow. (hossman)

* SOLR-3648: Fixed VelocityResponseWriter template loading in SolrCloud mode.
  For the example configuration, this means /browse now works with SolrCloud.
  (janhoy, ehatcher)

* SOLR-3677: Fixed misleading error message in web ui to distinguish between
  no SolrCores loaded vs. no /admin/ handler available.
  (hossman, steffkes)

* SOLR-3428: SolrCmdDistributor flushAdds/flushDeletes can cause repeated
  adds/deletes to be sent (Mark Miller, Per Steffensen)

* SOLR-3647: DistributedQueue should use our Solr zk client rather than the std zk
  client. ZooKeeper expiration can be permanent otherwise. (Mark Miller)

Other Changes
----------------------

* SOLR-3524: Make discarding punctuation configurable in JapaneseTokenizerFactory.
  The default is to discard punctuation, but this is overridable as an expert option.
  (Kazuaki Hiraga, Jun Ohtani via Christian Moen)

* SOLR-1770: Move the default core instance directory into a collection1 folder.
  (Mark Miller)

* SOLR-3355: Add shard and collection to SolrCore statistics. (Michael Garski, Mark Miller)

* SOLR-3575: solr.xml should default to persist=true (Mark Miller)

* SOLR-3563: Unloading all cores in a SolrCloud collection will now cause the removal of
  that collection's meta data from ZooKeeper. (Mark Miller, Per Steffensen)

* SOLR-3599: Add zkClientTimeout to solr.xml so that it's obvious how to change it and so
  that you can change it with a system property. (Mark Miller)

* SOLR-3609: Change Solr's expanded webapp directory to be at a consistent path called
  solr-webapp rather than a temporary directory. (Mark Miller)

* SOLR-3600: Raise the default zkClientTimeout from 10 seconds to 15 seconds. (Mark Miller)

* SOLR-3215: Clone SolrInputDocument when distrib indexing so that update processors after
  the distrib update process do not process the document twice. (Mark Miller)

* SOLR-3683: Improved error handling if an <analyzer> contains both an
  explicit class attribute, as well as nested factories. (hossman)

* SOLR-3682: Fail to parse schema.xml if uniqueKeyField is multivalued (hossman)

* SOLR-2115: DIH no longer requires the "config" parameter to be specified in solrconfig.xml.
  Instead, the configuration is loaded and parsed with every import. This allows the use of
  a different configuration with each import, and makes correcting configuration errors simpler.
  Also, the configuration itself can be passed using the "dataConfig" parameter rather than
  using a file (this previously worked in debug mode only). When configuration errors are
  encountered, the error message is returned in XML format. (James Dyer)

* SOLR-3439: Make SolrCell easier to use out of the box. Also improves "/browse" to display
  rich-text documents correctly, along with facets for author and content_type.
  With the new "content" field, highlighting of body is supported. See also SOLR-3672 for
  easier posting of a whole directory structure. (Jack Krupansky, janhoy)

* SOLR-3579: SolrCloud view should default to the graph view rather than tree view.
  (steffkes, Mark Miller)

================== 4.0.0-ALPHA ==================
More information about this release, including any errata related to the
release notes, upgrade instructions, or other changes may be found online at:
   https://wiki.apache.org/solr/Solr4.0


Versions of Major Components
---------------------
Apache Tika 1.1
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.3.4


Upgrading from Solr 3.6-dev
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
  previous versions of Solr will no longer be able to read your indices.
  In a master/slave configuration, all searchers/slaves should be upgraded
  before the master. If the master were to be updated first, the older
  searchers would not be able to read the new index format.

* Setting abortOnConfigurationError=false is no longer supported
  (since it has never worked properly). Solr will now warn you if
  you attempt to set this configuration option at all. (see SOLR-1846)

* The default logic for the 'mm' param of the 'dismax' QParser has
  been changed. If no 'mm' param is specified (either in the query,
  or as a default in solrconfig.xml) then the effective value of the
  'q.op' param (either in the query or as a default in solrconfig.xml
  or from the 'defaultOperator' option in schema.xml) is used to
  influence the behavior. If q.op is effectively "AND" then mm=100%.
  If q.op is effectively "OR" then mm=0%. Users who wish to force the
  legacy behavior should set a default value for the 'mm' param in
  their solrconfig.xml file.
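  The new 'mm' default described above can be modelled roughly as follows. This
  is a Python sketch of the rule as worded in this note, not the parser's code;
  the function name and the "OR" fallback for an unset operator are assumptions.

```python
def effective_mm(params, schema_default_operator="OR"):
    """Pick the 'mm' value for a dismax request; an explicit 'mm' always wins."""
    if "mm" in params:
        return params["mm"]
    # q.op from the request (or its solrconfig.xml default) takes precedence
    # over the schema's defaultOperator.
    q_op = params.get("q.op", schema_default_operator).upper()
    return "100%" if q_op == "AND" else "0%"
```

  For example, effective_mm({"q.op": "AND"}) gives "100%", while a request that
  sets mm explicitly keeps its own value regardless of q.op.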

* The VelocityResponseWriter is no longer built into the core. Its JAR and
  dependencies now need to be added (via <lib> or solr/home lib inclusion),
  and it needs to be registered in solrconfig.xml like this:
    <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>

* The update request parameter to choose Update Request Processor Chain is
  renamed from "update.processor" to "update.chain". The old parameter was
  deprecated but still working since Solr 3.2, but is now removed
  entirely.

* The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
  and replaced with the <indexConfig> section. There are also better defaults.
  When migrating, if you don't know what your old settings mean, simply delete
  both <indexDefaults> and <mainIndex> sections. If you have customizations,
  put them in the <indexConfig> section, with the same syntax as before.

* Two of the SolrServer subclasses in SolrJ were renamed/replaced.
  CommonsHttpSolrServer is now HttpSolrServer, and
  StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.

* The PingRequestHandler no longer looks for a <healthcheck/> option in the
  (legacy) <admin> section of solrconfig.xml. Users who wish to take
  advantage of this feature should configure a "healthcheckFile" init param
  directly on the PingRequestHandler. As part of this change, relative file
  paths have been fixed to be resolved against the data dir. See the example
  solrconfig.xml and SOLR-1258 for more details.

* Due to low level changes to support SolrCloud, the uniqueKey field can no
  longer be populated via <copyField/> or <field default=...> in the
  schema.xml. Users wishing to have Solr automatically generate a uniqueKey
  value when adding documents should instead use an instance of
  solr.UUIDUpdateProcessorFactory in their update processor chain. See
  SOLR-2796 for more details.


Detailed Change List
----------------------

New Features
----------------------

* SOLR-3272: Solr filter factory for MorfologikFilter (Polish lemmatisation).
  (Rafał Kuć via Dawid Weiss, Steven Rowe, Uwe Schindler).

* SOLR-571: The autowarmCount for LRUCaches (LRUCache and FastLRUCache) now
  supports "percentages" which get evaluated relative to the current size of
  the cache when warming happens.
  (Tomás Fernández Löbbe and hossman)
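  How a percentage autowarmCount might be resolved can be sketched in Python as
  below. The function name, the rounding direction, and the cap at the cache
  size are assumptions of this sketch rather than documented Solr behaviour.

```python
def autowarm_item_count(autowarm_count, current_cache_size):
    """Resolve an autowarmCount setting ("128" or "50%") to a number of entries."""
    value = str(autowarm_count).strip()
    if value.endswith("%"):
        percent = int(value[:-1])
        # Integer arithmetic; rounding down is an assumption of this sketch.
        return current_cache_size * percent // 100
    # An absolute count cannot warm more entries than the cache holds.
    return min(int(value), current_cache_size)
```

  Because the percentage is evaluated when warming happens, the same setting
  yields different counts as the cache grows or shrinks.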

* SOLR-1932: New relevancy function queries: termfreq, tf, docfreq, idf,
  norm, maxdoc, numdocs. (yonik)

* SOLR-1665: Add debug component options for timings, results and query info only (gsingers, hossman, yonik)

* SOLR-2112: Solrj API now supports streaming results. (ryan)

* SOLR-792: Adding PivotFacetComponent for Hierarchical faceting
  (ehatcher, Jeremy Hinegardner, Thibaut Lassalle, ryan)

* LUCENE-2507, SOLR-2571, SOLR-2576: Added DirectSolrSpellChecker, which uses Lucene's
  DirectSpellChecker to retrieve correction candidates directly from the term dictionary using
  Levenshtein automata. (James Dyer, rmuir)

* SOLR-1873, SOLR-2358: SolrCloud - added shared/central config and core/shard management via ZooKeeper,
  built-in load balancing, and distributed indexing.
  (Jamie Johnson, Sami Siren, Ted Dunning, yonik, Mark Miller)
  Additional Work:
    - SOLR-2324: SolrCloud solr.xml parameters are not persisted by CoreContainer.
      (Massimo Schiavon, Mark Miller)
    - SOLR-2287: Allow users to query by multiple, compatible collections with SolrCloud.
      (Soheb Mahmood, Alex Cowell, Mark Miller)
    - SOLR-2622: ShowFileRequestHandler does not work in SolrCloud mode.
      (Stefan Matheis, Mark Miller)
    - SOLR-3108: Error in SolrCloud's replica lookup code when replicas are hosted in the same Solr instance.
      (Bruno Dumon, Sami Siren, Mark Miller)
    - SOLR-3080: Remove shard info from ZooKeeper when SolrCore is explicitly unloaded.
      (yonik, Mark Miller, siren)
    - SOLR-3437: Recovery issues a spurious commit to the cluster. (Trym R. Møller via Mark Miller)
    - SOLR-2822: Skip update processors already run on other nodes (hossman)

* SOLR-1566: Transforming documents in the ResponseWriters. This will allow
  for more complex results in responses and open the door for function queries
  as results.
  (ryan with patches from grant, noble, cmale, yonik, Jan Høydahl,
  Arul Kalaipandian, Luca Cavanna, hossman)
    - SOLR-2037: Thanks to SOLR-1566, documents boosted by the QueryElevationComponent
      can be marked as boosted. (gsingers, ryan, yonik)

* SOLR-2396: Add CollationField, which is much more efficient than
  the Solr 3.x CollationKeyFilterFactory, and also supports
  Locale-sensitive range queries. (rmuir)

* SOLR-2338: Add support for using <similarity/> in a schema's fieldType,
  for customizing scoring on a per-field basis. (hossman, yonik, rmuir)

* SOLR-2335: New 'field("...")' function syntax for referring to complex
  field names (containing whitespace or special characters) in functions.

* SOLR-2383: /browse improvements: generalize range and date facet display
  (Jan Høydahl via yonik)

* SOLR-2272: Pseudo-join queries / filters. Examples:
    - To restrict to the set of parents with at least one blue-eyed child:
        fq={!join from=parent to=name}eyes:blue
    - To restrict to the set of children with at least one blue-eyed parent:
        fq={!join from=name to=parent}eyes:blue
  (yonik)
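  The two join filters above can be assembled and URL-encoded from a client as in
  this Python sketch. The helper name and the q=*:* main query are illustrative
  assumptions; only the {!join from=... to=...} syntax comes from the entry.

```python
from urllib.parse import urlencode, parse_qs

def join_filter(from_field, to_field, inner_query):
    """Build a {!join} filter query string in local-params syntax."""
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)

# Parents with at least one blue-eyed child.
parents_fq = join_filter("parent", "name", "eyes:blue")
# Children with at least one blue-eyed parent.
children_fq = join_filter("name", "parent", "eyes:blue")

# Braces, '!' and spaces must be percent-encoded when sent over HTTP.
query_string = urlencode({"q": "*:*", "fq": parents_fq})
decoded = parse_qs(query_string)
```

  Decoding the query string gives back the original fq value, which is a quick
  check that the local-params prefix survived encoding.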

* SOLR-1942: Added the ability to select postings format per fieldType in schema.xml
  as well as support custom Codecs in solrconfig.xml.
  (simonw via rmuir)

* SOLR-2136: Boolean type added to function queries, along with
  new functions exists(), if(), and(), or(), xor(), not(), def(),
  and true and false constants. (yonik)

* SOLR-2491: Add support for using spellcheck collation in conjunction
  with grouping. Note that the number of hits returned for collations
  is the number of ungrouped hits. (James Dyer via rmuir)

* SOLR-1298: Return FunctionQuery as pseudo field. The Solr 'fl' param
  now supports functions. For example: fl=id,sum(x,y) -- NOTE: only
  functions with fast random access are recommended. (yonik, ryan)

* SOLR-705: Optionally return shard info with each document in distributed
  search. Use fl=id,[shard] to return the shard url. (ryan)

* SOLR-2417: Add explain info directly to return documents using
  ?fl=id,[explain] (ryan)

* SOLR-2533: Converted ValueSource.ValueSourceSortField over to new rewriteable Lucene
  SortFields. ValueSourceSortField instances must be rewritten before they can be used.
  This is done by SolrIndexSearcher when necessary. (Chris Male).

* SOLR-2193, SOLR-2565: You may now specify a 'soft' commit when committing. This will
  use Lucene's NRT feature to avoid guaranteeing documents are on stable storage in exchange
  for faster reopen times. There is also a new 'soft' autocommit tracker that can be
  configured. (Mark Miller, Robert Muir)

* SOLR-2399: Updated Solr Admin interface. New look and feel with per core administration
  and many new options. (Stefan Matheis via ryan)

* SOLR-1032: CSV handler now supports "literal.field_name=value" parameters.
  (Simon Rosenthal, ehatcher)

* SOLR-2656: realtime-get, efficiently retrieves the latest stored fields for specified
  documents, even if they are not yet searchable (i.e. without reopening a searcher)
  (yonik)

* SOLR-2703: Added support for Lucene's "surround" query parser. (Simon Rosenthal, ehatcher)

* SOLR-2754: Added factories for several ranking algorithms:
    - BM25SimilarityFactory: Okapi BM25
    - DFRSimilarityFactory: Divergence from Randomness models
    - IBSimilarityFactory: Information-based models
    - LMDirichletSimilarityFactory: LM with Dirichlet smoothing
    - LMJelinekMercerSimilarityFactory: LM with Jelinek-Mercer smoothing
  (David Mark Nemeskey, Robert Muir)

* SOLR-2134: Trie* fields should support sortMissingLast=true, and deprecate Sortable* Field Types
  (Ryan McKinley, Mike McCandless, Uwe Schindler, Erick Erickson)

* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing
  for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
  "multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
  specify <analyzer type="multiterm"> (Pete Sturge, Erick Erickson, mentoring from Seeley and Muir)

* SOLR-2481: Add support for commitWithin in DataImportHandler (Sami Siren via yonik)

* SOLR-2992: Add support for IndexWriter.prepareCommit() via prepareCommit=true
  on update URLs. (yonik)

* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)

* SOLR-3069: Ability to add openSearcher=false to not open a searcher when doing
  a hard commit. commitWithin now only invokes a softCommit. (yonik)

* SOLR-2802: New FieldMutatingUpdateProcessor and Factory to simplify the
  development of UpdateProcessors that modify field values of documents as
  they are indexed. Also includes several useful new implementations:
    - RemoveBlankFieldUpdateProcessorFactory
    - TrimFieldUpdateProcessorFactory
    - HTMLStripFieldUpdateProcessorFactory
    - RegexReplaceProcessorFactory
    - FieldLengthUpdateProcessorFactory
    - ConcatFieldUpdateProcessorFactory
    - FirstFieldValueUpdateProcessorFactory
    - LastFieldValueUpdateProcessorFactory
    - MinFieldValueUpdateProcessorFactory
    - MaxFieldValueUpdateProcessorFactory
    - TruncateFieldUpdateProcessorFactory
    - IgnoreFieldUpdateProcessorFactory
  (hossman, janhoy)

* SOLR-3120: Optional post filtering for spatial queries bbox and geofilt
  for LatLonType. (yonik)

* SOLR-2459: Expose LogLevel selection with a RequestHandler rather than a servlet
  (Stefan Matheis, Upayavira, ryan)

* SOLR-3134: Include shard info in distributed response when shards.info=true
  (Russell Black, ryan)

* SOLR-2898: Support grouped faceting. (Martijn van Groningen)
  Additional Work:
    - SOLR-3406: Extended grouped faceting support to facet.query and facet.range parameters.
      (David Boychuck, Martijn van Groningen)

* SOLR-2949: QueryElevationComponent is now supported with distributed search.
  (Mark Miller, yonik)

* SOLR-3221: Added the ability to directly configure aspects of the concurrency
  and thread-pooling used within distributed search in Solr. This allows for
  finer-grained control and can be tuned by end users to target their own specific
  requirements. This builds on the work of the HttpCommComponent and uses the same configuration
  block to configure the thread pool. The default configuration has
  the same behaviour as Solr 3.5, favouring throughput over latency. More
  information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)

* SOLR-3278: Added negative boost support to the Extended Dismax Query Parser Boost Query (bq).
  (James Dyer)

* SOLR-3255: OpenExchangeRates.Org Exchange Rate Provider for CurrencyField (janhoy)

* SOLR-3358: Logging events are captured and available from the /admin/logging
  request handler. (ryan)

* SOLR-1535: PreAnalyzedField type provides a functionality to index (and optionally store)
  field content that was already processed and split into tokens using some external processing
  chain. Serialization format is pluggable, and defaults to JSON. (ab)

* SOLR-3363: Consolidated Exceptions in Analysis Factories so they only throw
  InitializationExceptions (Chris Male)

* SOLR-2690: New support for a "TZ" request param which overrides the TimeZone
  used when rounding Dates in DateMath expressions for the entire request
  (all date range queries and date faceting are affected). The default TZ
  is still UTC. (David Schlotfeldt, hossman)
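  Why the time zone matters for rounding can be seen in a small Python sketch.
  It is not Solr's DateMath code: the helper name is hypothetical, and a fixed
  UTC-08:00 offset stands in for a named zone so the example needs only the
  standard library.

```python
from datetime import datetime, timedelta, timezone

def round_down_to_day(instant, tz):
    """Round an aware datetime down to midnight in 'tz', returned in UTC."""
    local = instant.astimezone(tz)
    midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight.astimezone(timezone.utc)

now = datetime(2012, 7, 4, 3, 30, tzinfo=timezone.utc)
# Rounding in UTC (the default) stays on 4 July.
utc_day = round_down_to_day(now, timezone.utc)
# In a zone eight hours behind UTC it is still 3 July locally.
west_day = round_down_to_day(now, timezone(timedelta(hours=-8)))
```

  The same instant therefore rounds to two different day boundaries, which
  shifts every date range bucket computed from it.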

* SOLR-3402: Analysis Factories are now configured with their Lucene Version
  through setLuceneMatchVersion, rather than through the Map passed to init.
  Parsing and simple error checking for the Version is now done inside
  the code that creates the Analysis Factories. (Chris Male)

* SOLR-3178: Optimistic locking. If a _version_ is provided with an update
  that does not match the version in the index, an HTTP 409 error (Conflict)
  will result. (Per Steffensen, yonik)
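  The version check can be pictured with an in-memory stand-in written in
  Python. Everything here is hypothetical (class names, the counter used as a
  version source); it models only the match/mismatch case described above and
  none of the other _version_ semantics Solr may apply.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 (Conflict) response."""
    status = 409

class VersionedStore:
    def __init__(self):
        self._docs = {}      # id -> (version, fields)
        self._clock = 0      # simple counter used as the version source

    def update(self, doc):
        doc = dict(doc)
        doc_id = doc["id"]
        expected = doc.pop("_version_", None)
        current = self._docs.get(doc_id, (None, None))[0]
        # Only enforce the check when the client supplied a version.
        if expected is not None and expected != current:
            raise VersionConflict("expected %r, found %r" % (expected, current))
        self._clock += 1
        self._docs[doc_id] = (self._clock, doc)
        return self._clock
```

  A client that read version 1, then lost a race to another writer, gets the
  conflict instead of silently overwriting the newer document.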

* SOLR-139: Updateable documents. JSON Example:
  {"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}} will result in field "f1"
  being set to 10, "f2" having an additional value of 20 added, and all
  other existing fields unchanged. All source fields must be stored for
  this feature to work correctly. (Ryan McKinley, Erik Hatcher, yonik)
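  The effect of the "set" and "add" modifiers on a stored document can be
  sketched in Python. This is a model of the behaviour described in the entry,
  not Solr's implementation; the function name and the sample stored document
  are assumptions.

```python
def apply_update(existing, update):
    """Return a new document with 'set'/'add' modifiers applied to 'existing'."""
    result = dict(existing)
    for field, value in update.items():
        if not isinstance(value, dict):
            result[field] = value            # plain value, e.g. the "id" key
        elif "set" in value:
            result[field] = value["set"]     # replace the field's value
        elif "add" in value:
            current = result.get(field, [])
            if not isinstance(current, list):
                current = [current]
            result[field] = current + [value["add"]]  # append one more value
    return result

stored = {"id": "mydoc", "f1": 1, "f2": [5], "f3": "unchanged"}
updated = apply_update(stored, {"id": "mydoc", "f1": {"set": 10}, "f2": {"add": 20}})
```

  The sketch starts from the stored fields, which mirrors the requirement that
  all source fields be stored: anything not stored could not be carried over.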

* SOLR-2857: Support XML,CSV,JSON, and javabin in a single RequestHandler and
  choose the correct ContentStreamLoader based on Content-Type header. This
  also deprecates the existing [Xml,JSON,CSV,Binary,Xslt]UpdateRequestHandler.
  (ryan)

* SOLR-2585: Context-Sensitive Spelling Suggestions & Collations. This adds support
  for the "spellcheck.alternativeTermCount" & "spellcheck.maxResultsForSuggest"
  parameters, letting users receive suggestions even when all the queried terms
  exist in the dictionary. This differs from "spellcheck.onlyMorePopular" in
  that the suggestions need not consist entirely of terms with a greater document
  frequency than the queried terms. (James Dyer)

* SOLR-2058: Edismax query parser to allow "phrase slop" to be specified per-field
  on the pf/pf2/pf3 parameters using optional "FieldName~slop^boost" syntax. The
  prior "FieldName^boost" syntax is still accepted. In such cases the value on the
  "ps" parameter serves as the default slop. (Ron Mayer via James Dyer)
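  The "FieldName~slop^boost" syntax can be illustrated with a small Python
  parser. It is a sketch only: the regular expression, the integer slop, and the
  default boost of 1.0 are assumptions, not the Edismax parser's actual grammar.

```python
import re

_PF_ENTRY = re.compile(
    r"^(?P<field>[^~^\s]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>\d+(?:\.\d+)?))?$"
)

def parse_phrase_fields(pf, default_slop):
    """Parse a whitespace-separated pf value into (field, slop, boost) tuples."""
    parsed = []
    for entry in pf.split():
        match = _PF_ENTRY.match(entry)
        if match is None:
            raise ValueError("unparseable pf entry: %r" % entry)
        # Entries without ~slop fall back to the request-wide "ps" value.
        slop = int(match.group("slop")) if match.group("slop") else default_slop
        boost = float(match.group("boost")) if match.group("boost") else 1.0
        parsed.append((match.group("field"), slop, boost))
    return parsed
```

  For instance, parsing "title~2^10 body^3" with a default slop of 1 gives the
  title field its own slop of 2 while body inherits the default.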

* SOLR-3495: New UpdateProcessors have been added to create default values for
  configured fields. These work similarly to the <field default="..."/>
  option in schema.xml, but are applied in the UpdateProcessorChain, so they
  may be used prior to other UpdateProcessors, or to generate a uniqueKey field
  value when using the DistributedUpdateProcessor (ie: SolrCloud)
    TimestampUpdateProcessorFactory
    UUIDUpdateProcessorFactory
    DefaultValueUpdateProcessorFactory
  (hossman)

* SOLR-2993: Add WordBreakSolrSpellChecker to offer suggestions by combining adjacent
  query terms and/or breaking terms into multiple words. This spellchecker can be
  configured with a traditional checker (ie: DirectSolrSpellChecker). The results
  are combined and collations can contain a mix of corrections from both spellcheckers.
  (James Dyer)

* SOLR-3508: Simplify JSON update format for deletes as well as allow
  version specification for optimistic locking. Examples:
    - {"delete":"myid"}
    - {"delete":["id1","id2","id3"]}
    - {"delete":{"id":"myid", "_version_":123456789}}
  (yonik)
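  The three delete forms can be generated from a client with a helper like the
  Python sketch below. The function name and its argument handling are
  illustrative assumptions; only the JSON shapes come from the examples above.

```python
import json

def delete_command(ids, version=None):
    """Build a JSON delete body for one id, a list of ids, or an id plus _version_."""
    if version is not None:
        if not isinstance(ids, str):
            raise ValueError("a version applies to a single id")
        return json.dumps({"delete": {"id": ids, "_version_": version}})
    return json.dumps({"delete": ids})

single = delete_command("myid")
several = delete_command(["id1", "id2", "id3"])
versioned = delete_command("myid", version=123456789)
```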

* SOLR-3211: Allow parameter overrides in conjunction with "spellcheck.maxCollationTries".
  To do so, use parameters starting with "spellcheck.collateParam." For instance, to
  override the "mm" parameter, specify "spellcheck.collateParam.mm". This is helpful
  in cases where testing spellcheck collations for result counts should use different
  parameters from the main query (James Dyer)
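  Building such a request can be sketched in Python as follows. The helper name
  and the sample query values are hypothetical; the "spellcheck.collateParam."
  prefix is the one named in the entry.

```python
def with_collate_overrides(params, overrides):
    """Copy 'params' and add a spellcheck.collateParam.<name> entry per override."""
    merged = dict(params)
    for name, value in overrides.items():
        merged["spellcheck.collateParam." + name] = value
    return merged

request = with_collate_overrides(
    {"q": "solr serch", "mm": "50%", "spellcheck": "true",
     "spellcheck.collate": "true", "spellcheck.maxCollationTries": "5"},
    {"mm": "100%"},
)
```

  The main query keeps mm=50%, while collation test queries would be checked
  with the stricter mm=100%.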

* SOLR-2599: CloneFieldUpdateProcessorFactory provides similar functionality
  to schema.xml's <copyField/> declaration but as an update processor that can
  be combined with other processors in any order. (Jan Høydahl & hossman)

* SOLR-3351: eDismax: ps2 and ps3 params (janhoy)

* SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder
  in example solrconfig.xml. (Sebastian Lutze, koji)

* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much
  more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also
  supports Locale-sensitive range queries. (rmuir)


Optimizations
----------------------

* SOLR-1875: Per-segment field faceting for single valued string fields.
  Enable with facet.method=fcs, control the number of threads used with
  the "threads" local param on the facet.field param. This algorithm will
  only be faster in the presence of rapid index changes. (yonik)
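  A request using per-segment faceting might be assembled as in this Python
  sketch. The helper name, the "category" field, and the q=*:* query are
  assumptions made for illustration.

```python
from urllib.parse import urlencode

def fcs_facet_params(field, threads):
    """Request per-segment faceting on 'field' with a 'threads' local param."""
    return {
        "facet": "true",
        "facet.method": "fcs",
        # The local param goes in front of the field name.
        "facet.field": "{!threads=%d}%s" % (threads, field),
    }

params = fcs_facet_params("category", 4)
query_string = urlencode(dict(params, q="*:*"))
```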
|
||
|
||
* SOLR-1904: When facet.enum.cache.minDf > 0 and the base doc set is a
|
||
SortedIntSet, convert to HashDocSet for better performance. (yonik)
|
||
|
||
* SOLR-2092: Speed up single-valued and multi-valued "fc" faceting. Typical
|
||
improvement is 5%, but can be much greater (up to 10x faster) when facet.offset
|
||
is very large (deep paging). (yonik)
|
||
|
||
* SOLR-2193, SOLR-2565: The default Solr update handler has been improved so
|
||
that it uses fewer locks, keeps the IndexWriter open rather than closing it
|
||
on each commit (ie commits no longer wait for background merges to complete),
|
||
works with SolrCore to provide faster 'soft' commits, and has an improved API
|
||
that requires less instanceof special casing. (Mark Miller, Robert Muir)
|
||
Additional Work:
|
||
- SOLR-2697: commit and autocommit operations don't reset
|
||
DirectUpdateHandler2.numDocsPending stats attribute.
|
||
(Alexey Serba, Mark Miller)
|
||
|
||
* SOLR-2950: The QueryElevationComponent now avoids using the FieldCache and looking up
|
||
every document id (gsingers, yonik)
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
* SOLR-3139: Make ConcurrentUpdateSolrServer send UpdateRequest.getParams()
|
||
as HTTP request params (siren)
|
||
|
||
* SOLR-3165: Cannot use DIH in Solrcloud + Zookeeper (Alexey Serba,
|
||
Mark Miller, siren)
|
||
|
||
* SOLR-3068: Occasional NPE in ThreadDumpHandler (siren)
|
||
|
||
* SOLR-2762: FSTLookup could return duplicate results or one results less
|
||
than requested. (David Smiley, Dawid Weiss)
|
||
|
||
* SOLR-2741: Bugs in facet range display in trunk (janhoy)
|
||
|
||
* SOLR-1908: Fixed SignatureUpdateProcessor to fail to initialize on
|
||
invalid config. Specifically: a signatureField that does not exist,
|
||
or overwriteDupes=true with a signatureField that is not indexed.
|
||
(hossman)
|
||
|
||
* SOLR-1824: IndexSchema will now fail to initialize if there is a
|
||
problem initializing one of the fields or field types. (hossman)
|
||
|
||
* SOLR-1928: TermsComponent didn't correctly break ties for non-text
|
||
fields sorted by count. (yonik)
|
||
|
||
* SOLR-2107: MoreLikeThisHandler doesn't work with alternate qparsers. (yonik)
|
||
|
||
* SOLR-2108: Fixed false positives when using wildcard queries on fields with reversed
|
||
wildcard support. For example, a query of *zemog* would match documents that contain
|
||
'gomez'. (Landon Kuhn via Robert Muir)
|
||
|
||
* SOLR-1962: SolrCore#initIndex should not use a mix of indexPath and newIndexPath (Mark Miller)
|
||
|
||
* SOLR-2275: fix DisMax 'mm' parsing to be tolerant of whitespace
|
||
(Erick Erickson via hossman)
|
||
|
||
* SOLR-2193, SOLR-2565, SOLR-2651: SolrCores now properly share IndexWriters across SolrCore reloads.
|
||
(Mark Miller, Robert Muir)
|
||
Additional Work:
|
||
- SOLR-2705: On reload, IndexWriterProvider holds onto the initial SolrCore it was created with.
|
||
(Yury Kats, Mark Miller)
|
||
|
||
* SOLR-2682: Remove addException() in SimpleFacet. FacetComponent no longer catches and embeds
|
||
exceptions occurred during facet processing, it throws HTTP 400 or 500 exceptions instead. (koji)
|
||
|
||
* SOLR-2654: Directorys used by a SolrCore are now closed when they are no longer used.
|
||
(Mark Miller)
|
||
|
||
* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling,
|
||
rather than loading URL content streams automatically regardless of use.
|
||
(David Smiley and Ryan McKinley via ehatcher)
|
||
|
||
* SOLR-2829: Fix problem with false-positives due to incorrect
|
||
equals methods. (Yonik Seeley, Hossman, Erick Erickson.
|
||
Marc Tinnemeyer caught the bug)
|
||
|
||
* SOLR-2848: Removed 'instanceof AbstractLuceneSpellChecker' hacks from distributed spellchecking code,
|
||
and added a merge() method to SolrSpellChecker instead. Previously if you extended SolrSpellChecker
|
||
your spellchecker would not work in distributed fashion. (James Dyer via rmuir)
|
||
|
||
* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when the term contains
|
||
a hyphen. (Thomas Gambier caught the bug, Steffen Godskesen did the patch, via Erick Erickson)
|
||
|
||
* SOLR-1730: Made it clearer when a core failed to load as well as better logging when the
|
||
QueryElevationComponent fails to properly initialize (gsingers)
|
||
|
||
* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)
|
||
|
||
* SOLR-3037: When using binary format in solrj the codec screws up parameters
|
||
(Sami Siren, Jörg Maier via yonik)
|
||
|
||
* SOLR-3062: A join in the main query was not respecting any filters pushed
|
||
down to it via acceptDocs since LUCENE-1536. (Mike Hugo, yonik)
|
||
|
||
* SOLR-3214: If you use multiple fl entries rather than a comma separated list, all but the first
|
||
entry can be ignored if you are using distributed search. (Tomás Fernández Löbbe via Mark Miller)
|
||
|
||
* SOLR-3352: eDismax: pf2 should kick in for a query with 2 terms (janhoy)
|
||
|
||
* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit
|
||
(James Dyer, Tomás Fernández Löbbe)
|
||
|
||
* SOLR-2605: fixed tracking of the 'defaultCoreName' in CoreContainer so that
|
||
CoreAdminHandler could return consistent information regardless of whether
|
||
there is a a default core name or not. (steffkes, hossman)
|
||
|
||
* SOLR-3370: fixed CSVResponseWriter to respect globs in the 'fl' param
|
||
(Keith Fligg via hossman)
|
||
|
||
* SOLR-3436: Group count incorrect when not all shards are queried in the second
|
||
pass. (Francois Perron, Martijn van Groningen)
|
||
|
||
* SOLR-3454: Exception when using result grouping with main=true and using
|
||
wt=javabin. (Ludovic Boutros, Martijn van Groningen)
|
||
|
||
* SOLR-3446: Better errors when PatternTokenizerFactory is configured with
|
||
an invalid pattern, and include the 'name' whenever possible in plugin init
|
||
error messages. (hossman)
|
||
|
||
* LUCENE-4075: Cleaner path usage in TestXPathEntityProcessor
|
||
(Greg Bowyer via hossman)
|
||
|
||
* SOLR-2923: IllegalArgumentException when using useFilterForSortedQuery on an
|
||
empty index. (Adrien Grand via Mark Miller)
|
||
|
||
* SOLR-2352: Fixed TermVectorComponent so that it will not fail if the fl
|
||
param contains globs or psuedo-fields (hossman)
|
||
|
||
* SOLR-3541: add missing solrj dependencies to binary packages.
|
||
(Thijs Vonk via siren)
|
||
|
||
* SOLR-3522: fixed parsing of the 'literal()' function (hossman)
|
||
|
||
* SOLR-3548: Fixed a bug in the cachability of queries using the {!join}
|
||
parser or the strdist() function, as well as some minor improvements to
|
||
the hashCode implementation of {!bbox} and {!geofilt} queries.
|
||
(hossman)
|
||
|
||
* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
|
||
are respected now (Stanislaw Osinski, Dawid Weiss)
|
||
|
||
* SOLR-3430: Added a new DIH test against a real SQL database. Fixed problems
|
||
revealed by this new test related to the expanded cache support added to
|
||
3.6/SOLR-2382 (James Dyer)
|
||
|
||
* SOLR-1958: When using the MailEntityProcessor, import would fail if
|
||
fetchMailsSince was not specified. (Max Lynch via James Dyer)
|
||
|
||
* SOLR-4289: Admin UI - JVM memory bar - dark grey "used" width is too small
|
||
(steffkes, elyograg)
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-1846: Eliminate support for the abortOnConfigurationError
|
||
option. It has never worked very well, and in recent versions of
|
||
Solr hasn't worked at all. (hossman)
|
||
|
||
* SOLR-1889: The default logic for the 'mm' param of DismaxQParser and
|
||
ExtendedDismaxQParser has been changed to be determined based on the
|
||
effective value of the 'q.op' param (hossman)
|
||
|
||
* SOLR-1946: Misc improvements to the SystemInfoHandler: /admin/system
|
||
(hossman)
|
||
|
||
* SOLR-2289: Tweak spatial coords for example docs so they are a bit
|
||
more spread out (Erick Erickson via hossman)
|
||
|
||
* SOLR-2288: Small tweaks to eliminate compiler warnings. primarily
|
||
using Generics where applicable in method/object declarations, and
|
||
adding @SuppressWarnings("unchecked") when appropriate (hossman)
|
||
|
||
* SOLR-2375: Suggester Lookup implementations now store trie data
  and load it back on init. This means that large tries don't have to be
  rebuilt on every commit or core reload. (ab)

* SOLR-2413: Support for returning multi-valued fields w/o <arr> tag
  in the XMLResponseWriter was removed. XMLResponseWriter
  no longer works with "version" values less than 2.2 (ryan)

* SOLR-2423: FieldType argument changed from String to Object.
  Conversion from SolrInputDocument > Object > Fieldable is now managed
  by FieldType rather than DocumentBuilder. (ryan)

* SOLR-2461: QuerySenderListener and AbstractSolrEventListener are
  now public (hossman)

* LUCENE-2995: Moved some spellchecker and suggest APIs to modules/suggest:
  HighFrequencyDictionary, SortedIterator, TermFreqIterator, and the
  suggester APIs and implementations. (rmuir)

* SOLR-2576: Remove deprecated SpellingResult.add(Token, int).
  (James Dyer via rmuir)

* LUCENE-3232: Moved MutableValue classes to new 'common' module. (Chris Male)

* LUCENE-2883: FunctionQuery, DocValues (and its impls), ValueSource (and its
  impls) and BoostedQuery have been consolidated into the queries module. They
  can now be found at o.a.l.queries.function.

* SOLR-2027: FacetField.getValues() now returns an empty list if there are no
  values, instead of null (Chris Male)

* SOLR-1825: SolrQuery.addFacetQuery now enables facets automatically, like
  addFacetField (Chris Male)

* SOLR-2663: FieldTypePluginLoader has been refactored out of IndexSchema
  and made public. (hossman)

* SOLR-2331,SOLR-2691: Refactor CoreContainer's SolrXML serialization code and improve testing
  (Yury Kats, hossman, Mark Miller)

* SOLR-2698: Enhance CoreAdmin STATUS command to return index size.
  (Yury Kats, hossman, Mark Miller)

* SOLR-2654: The same Directory instance is now always used across a SolrCore so that
  it's easier to add other DirectoryFactory implementations without static caching hacks.
  (Mark Miller)

* LUCENE-3286: 'luke' ant target has been disabled due to incompatibilities with XML
  queryparser location (Chris Male)

* SOLR-1897: The data dir from the core descriptor should override the data dir from
  the solrconfig.xml rather than the other way round. (Mark Miller)

* SOLR-2756: Maven configuration: Excluded transitive stax:stax-api dependency
  from org.codehaus.woodstox:wstx-asl dependency. (David Smiley via Steve Rowe)

* SOLR-2588: Moved VelocityResponseWriter back to contrib module in order to
  remove it as a mandatory core dependency. (ehatcher)

* SOLR-2862: More explicit lexical resources location logged if Carrot2 clustering
  extension is used. Fixed the Solr implementations of IResource and IResourceLookup.
  (Dawid Weiss)

* SOLR-1123: Changed JSONResponseWriter to now use application/json as its Content-Type
  by default. However the Content-Type can be overridden and is set to text/plain in
  the example configuration. (Uri Boness, Chris Male)
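
  A minimal sketch of such an override in solrconfig.xml, assuming the writer
  is registered under the name "json"; the charset suffix is an illustrative
  assumption rather than something this entry specifies:

```xml
<!-- Replace the default application/json Content-Type with text/plain -->
<queryResponseWriter name="json" class="solr.JSONResponseWriter">
  <str name="content-type">text/plain; charset=UTF-8</str>
</queryResponseWriter>
```

  Removing the content-type setting should leave the writer on its new
  application/json default.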

* SOLR-2607: Removed deprecated client/ruby directory, which included solr-ruby and flare.
  (ehatcher)

* SOLR-3032: logOnce from SolrException, and all the supporting
  structure, is gone. abortOnConfigurationError is also gone as it is no longer referenced.
  Errors should be caught and logged at the top-most level or logged and NOT propagated up the
  chain. (Erick Erickson)

* SOLR-2105: Remove support for deprecated "update.processor" (since 3.2), in favor of
  "update.chain" (janhoy)
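
  For configurations that still set the old parameter, the change is likely a
  rename of the key. A sketch, assuming an update handler registered at
  /update and a hypothetical processor chain named "dedupe":

```xml
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <!-- formerly: <str name="update.processor">dedupe</str> -->
    <str name="update.chain">dedupe</str>
  </lst>
</requestHandler>
```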

* SOLR-3005: Default QueryResponseWriters are now initialized via init() with an empty
  NamedList. (Gasol Wu, Chris Male)

* SOLR-2607: Removed obsolete client/ folder (ehatcher, Eric Pugh, janhoy)

* SOLR-3202, SOLR-3244: Dropping Support for JSP. New Admin UI is all client side
  (ryan, Aliaksandr Zhuhrou, Uwe Schindler)

* SOLR-3159: Upgrade example and tests to run with Jetty 8 (ryan)

* SOLR-3254: Upgrade Solr to Tika 1.1 (janhoy)

* SOLR-3329: Dropped getSourceID() from SolrInfoMBean and using
  getClass().getPackage().getSpecificationVersion() for Version. (ryan)

* SOLR-3302: Upgraded SLF4j to version 1.6.4 (hossman)

* SOLR-3322: Add more context to IndexReaderFactory.newReader (ab)

* SOLR-3343: Moved FastWriter, FileUtils, RegexFileFilter, RTimer and SystemIdResolver
  from org.apache.solr.common to org.apache.solr.util (Chris Male)

* SOLR-3357: ResourceLoader.newInstance now accepts a Class representation of the expected
  instance type (Chris Male)

* SOLR-3388: HTTP caching is now disabled by default for RequestUpdateHandlers. (ryan)

* SOLR-3309: web.xml now specifies metadata-complete=true (which requires
  Servlet 2.5) to prevent servlet containers from scanning class annotations
  on startup. This allows for faster startup times on some servlet containers.
  (Bill Bell, hossman)

* SOLR-1893: Refactored some common code from LRUCache and FastLRUCache into
  SolrCacheBase (Tomás Fernández Löbbe via hossman)

* SOLR-3403: Deprecated Analysis Factories now log their own deprecation messages.
  No logging support is provided by Factory parent classes. (Chris Male)

* SOLR-1258: PingRequestHandler is now directly configured with a
  "healthcheckFile" instead of looking for the legacy
  <admin><healthcheck/></admin> syntax. Filenames specified as relative
  paths have been fixed so that they are resolved against the data dir
  instead of the CWD of the java process. (hossman)
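
  A sketch of the new style of configuration, assuming the handler is
  registered at /admin/ping; the file name and ping query below are
  hypothetical placeholders:

```xml
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">solrpingquery</str>
  </lst>
  <!-- A relative path is resolved against the core's data dir -->
  <str name="healthcheckFile">server-enabled.txt</str>
</requestHandler>
```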

* SOLR-3083: JMX beans now report Numbers as numeric values rather than Strings
  (Tagged Siteops, Greg Bowyer via ryan)

* SOLR-2796: Due to low level changes to support SolrCloud, the uniqueKey
  field can no longer be populated via <copyField/> or <field default=...>
  in the schema.xml.

* SOLR-3534: The Dismax and eDismax query parsers will fall back on the 'df' parameter
  when 'qf' is absent. If neither is present and the schema defines no default
  search field, an exception is now thrown. (dsmiley)

* SOLR-3262: The "threads" feature of DIH is removed (deprecated in Solr 3.6)
  (James Dyer)

* SOLR-3422: Refactored DIH internal data classes. All entities in
  data-config.xml must have a name (James Dyer)

Documentation
----------------------

* SOLR-2232: Improved README info on solr.solr.home in examples
  (Eric Pugh and hossman)

================== 3.6.2 ==================

Bug Fixes
----------------------
* SOLR-3790: ConcurrentModificationException could be thrown when using hl.fl=*.
  (yonik, koji)

* SOLR-3589: Edismax parser does not honor mm parameter if analyzer splits a token.
  (Tom Burton-West, Robert Muir)

================== 3.6.1 ==================
More information about this release, including any errata related to the
release notes, upgrade instructions, or other changes may be found online at:
   https://wiki.apache.org/solr/Solr3.6.1

Bug Fixes
----------------------

* LUCENE-3969: Throw IAE on bad arguments that could cause confusing errors in
  PatternTokenizer. CommonGrams populates PositionLengthAttribute correctly.
  (Uwe Schindler, Mike McCandless, Robert Muir)

* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit
  (James Dyer, Tomás Fernández Löbbe)

* SOLR-3375: Fix charset problems with HttpSolrServer (Roger Håkansson, yonik, siren)

* SOLR-3436: Group count incorrect when not all shards are queried in the second
  pass. (Francois Perron, Martijn van Groningen)

* SOLR-3454: Exception when using result grouping with main=true and using
  wt=javabin. (Ludovic Boutros, Martijn van Groningen)

* SOLR-3489: Config file replication less error prone (Jochen Just via janhoy)

* SOLR-3477: Solr does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)

* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
  are respected now (Stanislaw Osinski, Dawid Weiss)

* SOLR-3360: More DIH bug fixes for the deprecated "threads" parameter.
  (Mikhail Khludnev, Claudio R, via James Dyer)

* SOLR-3430: Added a new DIH test against a real SQL database. Fixed problems
  revealed by this new test related to the expanded cache support added to
  3.6/SOLR-2382 (James Dyer)

* SOLR-3336: SolrEntityProcessor substitutes most variables at query time.
  (Michael Kroh, Lance Norskog, via Martijn van Groningen)


================== 3.6.0 ==================
More information about this release, including any errata related to the
release notes, upgrade instructions, or other changes may be found online at:
   https://wiki.apache.org/solr/Solr3.6

Upgrading from Solr 3.5
----------------------
* SOLR-2983: As a consequence of moving the code which sets a MergePolicy from SolrIndexWriter to SolrIndexConfig,
  (custom) MergePolicies should now have an empty constructor; thus an IndexWriter should not be passed as a
  constructor parameter but instead set using the setIndexWriter() method.

* Because the doGet() methods in SimplePostTool were changed to static, client applications of this
  class need to be recompiled.

* In Solr version 3.5 and earlier, HTMLStripCharFilter had known bugs in the
  character offsets it provided, triggering e.g. exceptions in highlighting.
  HTMLStripCharFilter has been re-implemented, addressing this and other
  issues. See the entry for LUCENE-3690 in the Bug Fixes section below for a
  detailed list of changes. For people who depend on the behavior of
  HTMLStripCharFilter in Solr version 3.5 and earlier: the old implementation
  (bugs and all) is preserved as LegacyHTMLStripCharFilter.

* As of Solr 3.6, the <indexDefaults> and <mainIndex> sections of solrconfig.xml are deprecated
  and replaced with a new <indexConfig> section. Read more in SOLR-1052 below.

* SOLR-3040: The DIH's admin UI (dataimport.jsp) now requires DIH request handlers to start with
  a '/'. (dsmiley)

* SOLR-3161: <requestDispatcher handleSelect="false"> is now the default. An existing config will
  probably work as-is because handleSelect was explicitly enabled in default configs. handleSelect
  makes /select work and enables the 'qt' parameter. Instead, consider explicitly
  configuring /select as is done in the example solrconfig.xml, and register your other search
  handlers with a leading '/' which is a recommended practice. (David Smiley, Erik Hatcher)
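
  A sketch of what the recommended setup might look like in solrconfig.xml.
  The second handler name, /browse, is only a placeholder for any additional
  search handler:

```xml
<requestDispatcher handleSelect="false">
  <!-- existing <requestParsers/> and <httpCaching/> children stay unchanged -->
</requestDispatcher>

<!-- Search handlers registered explicitly, each with a leading '/' -->
<requestHandler name="/select" class="solr.SearchHandler"/>
<requestHandler name="/browse" class="solr.SearchHandler"/>
```

  With this layout clients call /select or /browse directly rather than
  selecting a handler through the 'qt' parameter.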

* SOLR-3161: Don't use the 'qt' parameter with a leading '/'. It probably won't work in 4.0
  and it's now limited in 3.6 to SearchHandler subclasses that aren't lazy-loaded.

* SOLR-2724: Specifying <defaultSearchField> and <solrQueryParser defaultOperator="..."/> in
  schema.xml is now considered deprecated. Instead you are encouraged to specify these via the "df"
  and "q.op" parameters in your request handler definition. (David Smiley)
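
  For example, the two schema.xml settings could be replaced by request
  handler defaults along these lines. This is a sketch; the field name "text"
  and the operator value are assumptions chosen for illustration:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- replaces <defaultSearchField>text</defaultSearchField> -->
    <str name="df">text</str>
    <!-- replaces <solrQueryParser defaultOperator="AND"/> -->
    <str name="q.op">AND</str>
  </lst>
</requestHandler>
```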

* Bugs were found and fixed in the SignatureUpdateProcessor that previously caused
  some documents to produce the same signature even when the configured fields
  contained distinct (non-String) values. Users of SignatureUpdateProcessor
  are strongly advised to re-index, as document signatures may
  have changed. (see SOLR-3200 & SOLR-3226 for details)

New Features
----------------------
* SOLR-2020: Add Java client that uses Apache Http Components http client (4.x).
  (Chantal Ackermann, Ryan McKinley, Yonik Seeley, siren)

* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling,
  rather than loading URL content streams automatically regardless of use.
  (David Smiley and Ryan McKinley via ehatcher)

* SOLR-2904: BinaryUpdateRequestHandler should be able to accept multiple update requests from
  a stream (shalin)

* SOLR-1565: StreamingUpdateSolrServer supports RequestWriter API and therefore, javabin update
  format (shalin)

* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing
  for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
  "multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
  specify <analyzer type="multiterm"> (Pete Sturge, Erick Erickson, Mentoring from Seeley and Muir)
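
  A sketch of an explicitly configured multiterm chain, assuming the analyzer
  is declared with type="multiterm" inside the field type. The field type
  name "text_ci" is hypothetical:

```xml
<fieldType name="text_ci" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Optional: applied to wildcard, prefix, regex and range terms -->
  <analyzer type="multiterm">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```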

* SOLR-2919: Added support for localized range queries when the analysis chain uses
  CollationKeyFilter or ICUCollationKeyFilter. (Michael Sokolov, rmuir)

* SOLR-2982: Added BeiderMorseFilterFactory for Beider-Morse (BMPM) phonetic encoder. Upgrades
  commons-codec to version 1.6 (Brooke Schreier Ganz, rmuir)

* SOLR-1843: A new "rootName" attribute is now available when
  configuring <jmx/> in solrconfig.xml. If this attribute is set,
  Solr will use it as the root name for all MBeans Solr exposes via
  JMX. The default root name is "solr" followed by the core name.
  (Constantijn Visinescu, hossman)
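
  As an illustration, the attribute might be set as follows; the root name
  shown is a made-up value:

```xml
<!-- All MBeans for this core would then be registered under "search.products" -->
<jmx rootName="search.products"/>
```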

* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)
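
  A sketch of selecting the new cache for the filter cache, assuming the
  implementation is exposed as solr.LFUCache and accepts the same sizing
  attributes as the existing LRU caches. The sizes are arbitrary:

```xml
<filterCache class="solr.LFUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>
```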

* SOLR-3036: Ability to specify overwrite=false on the URL for XML updates.
  (Sami Siren via yonik)

* SOLR-2603: Add the encoding function for alternate fields in highlighting.
  (Massimo Schiavon, koji)

* SOLR-1729: Evaluation of NOW for date math is done only once per request for
  consistency, and is also propagated to shards in distributed search.
  Adding a parameter NOW=<time_in_ms> to the request will override the
  current time. (Peter Sturge, yonik, Simon Willnauer)

* SOLR-1709: Distributed support for Date and Numeric Range Faceting
  (Peter Sturge, David Smiley, hossman, Simon Willnauer)

* SOLR-3054, LUCENE-3671: Add TypeTokenFilterFactory that creates TypeTokenFilter
  that filters tokens based on their TypeAttribute. (Tommaso Teofili via
  Uwe Schindler)

* LUCENE-3305, SOLR-3056: Added Kuromoji morphological analyzer for Japanese.
  See the 'text_ja' fieldtype in the example to get started.
  (Christian Moen, Masaru Hasegawa via Robert Muir)

* SOLR-1860: StopFilterFactory, CommonGramsFilterFactory, and
  CommonGramsQueryFilterFactory can optionally read stopwords in Snowball
  format (specify format="snowball"). (Robert Muir)
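
  A sketch of a stop filter reading a Snowball-format word list; the file
  path is a hypothetical example:

```xml
<filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="lang/stopwords_fr.txt"
        format="snowball"/>
```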

* SOLR-3105: ElisionFilterFactory optionally allows the parameter
  ignoreCase (default=false). (Robert Muir)

* LUCENE-3714: Add WFSTLookupFactory, a suggester that uses a weighted FST
  for more fine-grained suggestions. (Mike McCandless, Dawid Weiss, Robert Muir)

* SOLR-3143: Add SuggestQueryConverter, a QueryConverter intended for
  auto-suggesters. (Robert Muir)

* SOLR-3033: ReplicationHandler's backup command now supports a 'maxNumberOfBackups'
  init param that can be used to delete all but the most recent N backups. (Torsten Krah, James Dyer)

* SOLR-2202: Currency FieldType, with support for currencies and exchange rates
  (Greg Fodor & Andrew Morrison via janhoy, rmuir, Uwe Schindler)
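
  A sketch of how the new type might be declared in schema.xml, assuming the
  class is exposed as solr.CurrencyField. The field name "price_c" and the
  exchange-rate file name are hypothetical:

```xml
<fieldType name="currency" class="solr.CurrencyField" precisionStep="8"
           defaultCurrency="USD" currencyConfig="currency.xml"/>

<field name="price_c" type="currency" indexed="true" stored="true"/>
```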

* SOLR-3026: eDismax: Locking down which fields can be explicitly queried (user fields aka uf)
  (janhoy, hossmann, Tomás Fernández Löbbe)

* SOLR-2826: URLClassify Update Processor (janhoy)

* SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer (janhoy)

* SOLR-3221: Added the ability to directly configure aspects of the concurrency
  and thread-pooling used within distributed search in Solr. This allows for finer
  grained control and can be tuned by end users to target their own specific
  requirements. This builds on the work of the HttpCommComponent and uses the same configuration
  block to configure the thread pool. The default configuration has
  the same behaviour as Solr 3.5, favouring throughput over latency. More
  information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)

* SOLR-2001: The query component will substitute an empty query that matches
  no documents if the query parser returns null. This also prevents an
  exception from being thrown by the default parser if "q" is missing. (yonik)
  - SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)

* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
  These can be used to customize range query/sort behavior, for example to
  support numeric collation, ignore punctuation/whitespace, ignore accents but
  not case, control whether upper/lowercase values are sorted first, etc. (rmuir)

* SOLR-2346: Allow the content encoding to be set explicitly via the content type
  of the stream for the extracting request handler. This is convenient when Tika's
  auto detector cannot detect the encoding, especially when the text file is too
  short for detection. (koji)

* SOLR-1499: Added SolrEntityProcessor that imports data from another Solr core
  or instance based on a specified query.
  (Lance Norskog, Erik Hatcher, Pulkit Singhal, Ahmet Arslan, Luca Cavanna,
  Martijn van Groningen)

* SOLR-3190: Minor improvements to SolrEntityProcessor. Add more consistency
  between solr parameters and parameters used in SolrEntityProcessor and
  ability to specify a custom HttpClient instance.
  (Luca Cavanna via Martijn van Groningen)

* SOLR-2382: Added pluggable cache support to DIH so that any Entity can be
  made cache-able by adding the "cacheImpl" parameter. Include
  "SortedMapBackedCache" to provide in-memory caching (as previously this was
  the only option when using CachedSqlEntityProcessor). Users can provide
  their own implementations of DIHCache for other caching strategies.
  Deprecate CachedSqlEntityProcessor in favor of specifying "cacheImpl" with
  SqlEntityProcessor. Make SolrWriter implement DIHWriter and allow the
  possibility of pluggable Writers (DIH writing to something other than Solr).
  (James Dyer, Noble Paul)


Optimizations
----------------------
* SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter
  reportDocCount defaults to 'false'. Old behavior still possible by specifying this as 'true'
  (Erick Erickson)

* SOLR-3012: Move System.getProperty("type") in postData() to main() and add type argument so that
  the client applications of SimplePostTool can set content type via method argument. (koji)

* SOLR-2888: FSTSuggester refactoring: internal storage is now UTF-8,
  external sorting (on disk) prevents OOMs even with large data sets
  (the bottleneck is now FST construction), code cleanups and API cleanups.
  (Dawid Weiss, Robert Muir)

Bug Fixes
----------------------
* SOLR-3187: SystemInfoHandler leaks filehandles (siren)

* LUCENE-3820: Fixed invalid position indexes by reimplementing PatternReplaceCharFilter.
  This change also drops real support for boundary characters -- all input is prebuffered
  for pattern matching. (Dawid Weiss)

* SOLR-3068: Fixed NPE in ThreadDumpHandler (siren)

* SOLR-2912: Fixed File descriptor leak in ShowFileRequestHandler (Michael Ryan, shalin)

* SOLR-2819: Improved speed of parsing hex entities in HTMLStripCharFilter
  (Bernhard Berger, hossman)

* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when the term contains
  a hyphen. (Thomas Gambier caught the bug, Steffen Godskesen did the patch, via Erick Erickson)

* SOLR-2955: Fixed IllegalStateException when querying with group.sort=score desc in sharded
  environment. (Steffen Elberg Godskesen, Martijn van Groningen)

* SOLR-2956: Fixed inconsistencies in the flags (and flag key) reported by
  the LukeRequestHandler (hossman)

* SOLR-1730: Made it clearer when a core failed to load as well as better logging when the
  QueryElevationComponent fails to properly initialize (gsingers)

* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)

* SOLR-3024: Fixed JSONTestUtil.matchObj; in previous releases it was not
  respecting the 'delta' arg (David Smiley via hossman)

* SOLR-2542: Fixed DIH Context variables which were broken for all scopes other
  than SCOPE_ENTITY (Linbin Chen & Frank Wesemann via hossman)

* SOLR-3042: Fixed Maven Jetty plugin configuration.
  (David Smiley via Steve Rowe)

* SOLR-2970: CSV ResponseWriter returns fields defined as stored=false in schema (janhoy)

* LUCENE-3690, LUCENE-2208, SOLR-882, SOLR-42: Re-implemented
  HTMLStripCharFilter as a JFlex-generated scanner and moved it to
  lucene/contrib/analyzers/common/. See below for a list of bug fixes and
  other changes. To get the same behavior as HTMLStripCharFilter in Solr
  version 3.5 and earlier (including the bugs), use LegacyHTMLStripCharFilter,
  which is the previous implementation.

  Behavior changes from the previous version:

    - Known offset bugs are fixed.
    - The "Mark invalid" exceptions reported in SOLR-1283 are no longer
      triggered (the bug is still present in LegacyHTMLStripCharFilter).
    - The character entity "&apos;" is now always properly decoded.
    - More cases of <script> tags are now properly stripped.
    - CDATA sections are now handled properly.
    - Valid tag name characters now include the supplementary Unicode characters
      from Unicode character classes [:ID_Start:] and [:ID_Continue:].
    - Uppercase character entities "&QUOT;", "&COPY;", "&GT;", "&LT;", "&REG;",
      and "&AMP;" are now recognized and handled as if they were in lowercase.
    - The REPLACEMENT CHARACTER U+FFFD is now used to replace numeric character
      entities for unpaired UTF-16 low and high surrogates (in the range
      [U+D800-U+DFFF]).
    - Properly paired numeric character entities for UTF-16 surrogates are now
      converted to the corresponding code units.
    - Opening tags with unbalanced quotation marks are now properly stripped.
    - Literal "<" and ">" characters in opening tags, regardless of whether they
      appear inside quotation marks, now inhibit recognition (and stripping) of
      the tags. The only exception to this is for values of event-handler
      attributes, e.g. "onClick", "onLoad", "onSelect".
    - A newline '\n' is substituted instead of a space for stripped HTML markup.
    - Nothing is substituted for opening and closing inline tags - they are
      simply removed. The list of inline tags is (case insensitively): <a>,
      <abbr>, <acronym>, <b>, <basefont>, <bdo>, <big>, <cite>, <code>, <dfn>,
      <em>, <font>, <i>, <img>, <input>, <kbd>, <label>, <q>, <s>, <samp>,
      <select>, <small>, <span>, <strike>, <strong>, <sub>, <sup>, <textarea>,
      <tt>, <u>, and <var>.
    - HTMLStripCharFilterFactory now handles HTMLStripCharFilter's "escapedTags"
      feature: opening and closing tags with the given names, including any
      attributes and their values, are left intact in the output.
  (Steve Rowe)
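
  A sketch of the "escapedTags" feature in an analyzer definition, assuming
  the factory accepts a comma-separated list of tag names. The field type
  name and the chosen tags are hypothetical:

```xml
<fieldType name="text_html" class="solr.TextField">
  <analyzer>
    <!-- <a> and <title> tags are kept in the output; other markup is stripped -->
    <charFilter class="solr.HTMLStripCharFilterFactory" escapedTags="a, title"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```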

* LUCENE-3717: Fixed offset bugs in TrimFilter, WordDelimiterFilter, and
  HyphenatedWordsFilter where they would create invalid offsets in
  some situations, leading to problems in highlighting. (Robert Muir)

* SOLR-2280: commitWithin ignored for a delete query (Juan Grande via janhoy)

* SOLR-3073: Fixed 'Invalid UUID string' error when having a UUID field as
  the unique key and executing a distributed grouping request. (Devon Krisman, Martijn van Groningen)

* SOLR-3084: Fixed initialization error when using
  <queryResponseWriter default="true" ... /> (Bernd Fehling and hossman)

* SOLR-3109: Fixed numerous redundant shard requests when using distributed grouping.
  (rblack via Martijn van Groningen)

* SOLR-3052: Fixed typo in distributed grouping parameters.
  (Martijn van Groningen, Grant Ingersoll)

* SOLR-2909: Add support for ResourceLoaderAware tokenizerFactories in synonym
  filter factories. (Tom Klonikowski, Jun Ohtani via Koji Sekiguchi)

* SOLR-3168: ReplicationHandler "numberToKeep" & "maxNumberOfBackups" parameters
  would keep only 1 backup, even if more than 1 was specified (Neil Hooey, James Dyer)

* SOLR-3009: hitGrouped.vm isn't shipped with 3.x (ehatcher, janhoy)

* SOLR-3195: timeAllowed is ignored for grouping queries
  (Russell Black via Martijn van Groningen)

* SOLR-2124: Do not log stack traces for "Service Disabled" / 503 Exceptions (PingRequestHandler, etc)
  (James Dyer, others)

* SOLR-3260: DataImportHandler: ScriptTransformer gives better error messages when
  problems arise on initialization (no Script Engine, invalid script, etc). (James Dyer)

* SOLR-2959: edismax now respects the magic fields '_val_' and '_query_'
  (Michael Watts, hossman)

* SOLR-3074: fix SolrPluginUtils.docListToSolrDocumentList to respect the
  list of fields specified. This fix also deprecates
  DocumentBuilder.loadStoredFields which is not used anywhere in Solr,
  and was fundamentally broken/bizarre.
  (hossman, Ahmet Arslan)

* SOLR-2291: fix JSONWriter to respect field list when writing SolrDocuments
  (Ahmet Arslan via hossman)

* SOLR-3264: Fix CoreContainer and SolrResourceLoader logging to be more
  clear about when SolrCores are being created, and stop misleading people
  about SolrCore instanceDir's being the "Solr Home Dir" (hossman)

* SOLR-3046: Fix whitespace typo in DIH response "Time taken" (hossman)

* SOLR-3261: Fix edismax to respect query operators when literal colons
  are used in query string. (Juan Grande via hossman)

* SOLR-3226: Fix SignatureUpdateProcessor to no longer ignore non-String
  field values (Spyros Kapnissis, hossman)

* SOLR-3200: Fix SignatureUpdateProcessor "all fields" mode to use all
  fields of each document instead of the fields specified by the first
  document indexed (Spyros Kapnissis via hossman)

* SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and
  sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen)

* SOLR-3107: contrib/langid: When using the LangDetect implementation of
  langid, set the random seed to 0, so that the same document is detected as
  the same language with the same probability every time.
  (Christian Moen via rmuir)

* SOLR-2937: Configuring the number of contextual snippets used for
  search results clustering. The hl.snippets parameter is now respected
  by the clustering plugin, can be overridden by carrot.summarySnippets
  if needed (Stanislaw Osinski).

* SOLR-2938: Clustering on multiple fields. The carrot.title and
  carrot.snippet can now take comma- or space-separated lists of
  field names to cluster (Stanislaw Osinski).

* SOLR-2939: Clustering of multilingual search results. The document's
  language field can be passed in the carrot.lang parameter, the carrot.lcmap
  parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).

* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component.
  The custom field mappings are defined using the carrot.custom parameter
  (Stanislaw Osinski).

* SOLR-2941: NullPointerException on clustering component initialization
  when schema does not have a unique key field (Stanislaw Osinski).

* SOLR-2942: ClassCastException when passing non-textual fields to
  clustering component (Stanislaw Osinski).


Other Changes
----------------------
* SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji)

* SOLR-2920: Refactor frequent conditional use of DefaultSolrParams and
  AppendedSolrParams into factory methods.
  (David Smiley via hossman)

* SOLR-3032: Deprecate logOnce from SolrException. logOnce and all the supporting
  structure will disappear in 4.0. Errors should be caught and logged at the
  top-most level or logged and NOT propagated up the chain. (Erick Erickson)

* SOLR-2718: Add ability to lazy load response writers, defined with startup="lazy".
  (ehatcher)
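
  A sketch of a lazily loaded writer declaration in solrconfig.xml, using the
  Velocity writer as an illustrative choice:

```xml
<!-- The writer class is only loaded the first time wt=velocity is requested -->
<queryResponseWriter name="velocity"
                     class="solr.VelocityResponseWriter"
                     startup="lazy"/>
```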

* SOLR-2901: Upgrade Solr to Tika 1.0 (janhoy)

* SOLR-3059: Example XSL stylesheet for indexing query result XML (janhoy)

* SOLR-3097, SOLR-3105: Add analysis configurations for different languages to
  the example. (Christian Moen, Robert Muir)

* SOLR-3005: Default QueryResponseWriters are now initialized via init() with an empty
  NamedList. (Gasol Wu, Chris Male)

* SOLR-3140: Upgrade schema version to 1.5, where omitNorms defaults to "true" for all
  primitive (non-analyzed) field types such as int, float, date, bool, string. (janhoy)

* SOLR-3077: Better error messages when attempting to use "blank" field names
  (Antony Stubbs via hossman)

* SOLR-2712: expecting fl=score to return all fields is now deprecated.
  In Solr 4.0, this will only return the score. (ryan)

* SOLR-3156: Check for Lucene directory locks at startup. In previous versions
  this check was only performed when modifying the index (e.g. adding and
  deleting documents). (Luca Cavanna via Martijn van Groningen)

* SOLR-1052: Deprecated <indexDefaults> and <mainIndex> in solrconfig.xml.
  From now on, all settings go in the new <indexConfig> tag, and some defaults are
  changed: useCompoundFile=false, ramBufferSizeMB=32, lockType=native, so that
  the effect of NOT specifying <indexConfig> at all gives the same result as the
  example config used to give in 3.5 (janhoy, gsingers)
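
  A sketch of the new section with the three defaults named above written
  out explicitly. Leaving the section out entirely should be equivalent:

```xml
<indexConfig>
  <useCompoundFile>false</useCompoundFile>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <lockType>native</lockType>
</indexConfig>
```

  Settings previously split between <indexDefaults> and <mainIndex> would be
  merged into this single block.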

* SOLR-3294: In contrib/clustering/lib/, replaced the manually retrowoven
  Java 1.5-compatible carrot2-core-3.5.0.jar (which is not publicly available,
  except from the Solr Subversion repository), with newly released Java
  1.5-compatible carrot2-core-3.5.0.1.jar (hosted on the Maven Central
  repository). Also updated dependencies jackson-core-asl and
  jackson-mapper-asl (both v1.5.2 -> v1.7.4). (Dawid Weiss, Steve Rowe)

* SOLR-3295: netcdf jar is excluded from the binary release (and disabled in
  ivy.xml) because it requires Java 6. If you want to parse this content with
  extracting request handler and are willing to use Java 6, just add the jar.
  (rmuir)

* SOLR-3142: DIH Imports no longer default optimize to true, instead false.
  If you want to force all segments to be merged into one, you can specify
  this parameter yourself. NOTE: this can be a very expensive operation and
  usually does not make sense for delta-imports. (Robert Muir)

Build
----------------------
* SOLR-2487: Add build target to package war without slf4j jars (janhoy)

* SOLR-3112: Fix tests not to write to src/test-files (Luca Cavanna via Robert Muir)

* LUCENE-3753: Restructure the Solr build system. (Steve Rowe)

* SOLR-3204: The packaged pre-release artifact of Commons CSV used the original
  package name (org.apache.commons.csv). This created a compatibility issue as
  the Apache Commons team works toward an official release of Commons CSV.
  The source of Commons CSV was added under a separate package name to the
  Solr source code. (Uwe Schindler, Chris Male, Emmanuel Bourg)

* LUCENE-3930: Changed build system to use Apache Ivy for retrieval of 3rd
  party JAR files. Please review README.txt for instructions.
  (Robert Muir, Chris Male, Uwe Schindler, Steven Rowe, Hossman)

================== 3.5.0 ==================

New Features
----------------------
* SOLR-2749: Add boundary scanners for FastVectorHighlighter. <boundaryScanner/>
  can be specified with a name in solrconfig.xml, and use hl.boundaryScanner=name
  parameter to specify the named <boundaryScanner/>. (koji)
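
  A sketch of a named boundary scanner, assuming it is declared inside the
  highlight component's <highlighting> section. The name "breakIterator" and
  the WORD setting are illustrative choices:

```xml
<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <boundaryScanner name="breakIterator"
                     class="solr.highlight.BreakIteratorBoundaryScanner">
      <lst name="defaults">
        <str name="hl.bs.type">WORD</str>
      </lst>
    </boundaryScanner>
  </highlighting>
</searchComponent>
```

  A request would then select it with hl.boundaryScanner=breakIterator.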

* SOLR-2066,SOLR-2776: Added support for distributed grouping.
  (Martijn van Groningen, Jasper van Veghel, Matt Beaumont)

* SOLR-2769: Added factory for the new Hunspell stemmer capable of doing stemming
  for 99 languages (janhoy, cmale)
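
  A sketch of the new factory in an analyzer chain, assuming it is exposed as
  solr.HunspellStemFilterFactory. The dictionary and affix file names are
  hypothetical and would need to exist in the core's conf directory:

```xml
<filter class="solr.HunspellStemFilterFactory"
        dictionary="en_GB.dic"
        affix="en_GB.aff"
        ignoreCase="true"/>
```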
|
||
|
||
* SOLR-1979: New contrib "langid". Adds language identification capabilities as an
|
||
Update Processor, using Tika's LanguageIdentifier or Cybozu language-detection
|
||
library (janhoy, Tommaso Teofili, gsingers)
|
||
|
||
* SOLR-2818: Added before/after count response parsing support for range facets in
|
||
SolrJ. (Bernhard Frauendienst via Martijn van Groningen)
|
||
|
||
* SOLR-2276: Add support for cologne phonetic to PhoneticFilterFactory.
|
||
(Marc Pompl via rmuir)
|
||
|
||
* SOLR-1926: Add hl.q parameter. (koji)
|
||
|
||
* SOLR-2881: Numeric types now support sortMissingFirst/Last. This includes Trie and date types
|
||
(Ryan McKinley, Mike McCandless, Uwe Schindler, Erick Erickson)
|
||
|
||
* SOLR-1023: StatsComponent now supports date fields and string fields.
|
||
(Chris Male, Mark Holland, Gunnlaugur Thor Briem, Ryan McKinley)
|
||
|
||
* SOLR-2578: ReplicationHandler's backup command now supports a 'numberToKeep'
|
||
request param that can be used to delete all but the most recent N backups.
|
||
(James Dyer via hossman)
|
||
|
||
* SOLR-2839: Add alternative implementation to contrib/langid supporting 53
|
||
languages, based on http://code.google.com/p/language-detection/ (rmuir)
|
||
|
||
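The retention rule behind the 'numberToKeep' param (SOLR-2578) can be sketched
as follows. This is a minimal illustration, not Solr's implementation: the
function name prune_backups is hypothetical, and it assumes backup names embed
a sortable timestamp so that lexical order matches age.

```python
def prune_backups(backup_names, number_to_keep):
    """Split backups into (kept, deleted), keeping the newest number_to_keep.

    Assumption: names carry a sortable timestamp suffix, so sorting the
    names in descending order lists the most recent backup first.
    """
    if number_to_keep < 0:
        raise ValueError("number_to_keep must be non-negative")
    newest_first = sorted(backup_names, reverse=True)
    return newest_first[:number_to_keep], newest_first[number_to_keep:]


# Illustrative names only; real backup names may differ.
kept, deleted = prune_backups(
    ["snapshot.20111001", "snapshot.20111003", "snapshot.20111002"],
    number_to_keep=2,
)
```

Here kept holds the two most recent names and deleted holds the oldest one.
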
Optimizations
----------------------

* SOLR-2742: SolrJ: Provide commitWithinMs as optional parameter for all add() methods,
making the feature more conveniently accessible for developers (janhoy)

Bug Fixes
----------------------
* SOLR-2748: The CommitTracker used for commitWithin or autoCommit by maxTime
could commit too frequently and could block adds until a new searcher was
registered. (yonik)

* SOLR-2726: Fixed NullPointerException when using spellcheck.q with Suggester.
(Bernd Fehling, valentin via rmuir)

* SOLR-2772: Fixed Date parsing/formatting of years 0001-1000 (hossman)

* SOLR-2763: Extracting update request handler throws exception and returns 400
when zero-length file posted using multipart form post (janhoy)

* SOLR-2780: Fixed issue where multi select facets didn't respect group.truncate parameter.
(Martijn van Groningen, Ramzi Alqrainy)

* SOLR-2793: In rare cases (most likely during shutdown), a SolrIndexSearcher can be left
open if the executor rejects a task. (Mark Miller)

* SOLR-2791: Replication: abortfetch command is broken if replication was started
by fetchindex command instead of a regular poll (Yury Kats via shalin)

* SOLR-2861: Fix extremely rare race condition on commit that can result
in a NPE (yonik)

* SOLR-2813: Fix HTTP error codes returned when requests contain strings that
cannot be parsed as numbers for Trie fields. (Jeff Crump and hossman)

* SOLR-2902: The list of collations was parsed incorrectly in SpellCheckResponse,
causing a wrong number of collation results in the response.
(Bastiaan Verhoef, James Dyer via Simon Willnauer)

* SOLR-2875: Fix the incorrect url in DIH example tika-data-config.xml
(Shinichiro Abe via koji)

Other Changes
----------------------

* SOLR-2750: Make both "update.chain" and the deprecated "update.param" work
consistently everywhere; see also SOLR-2105. (Mark Miller, janhoy)

* LUCENE-3410: Deprecated the WordDelimiterFilter constructors accepting multiple
ints masquerading as booleans. Preferred constructor now accepts a single int
bitfield (Chris Male)

* SOLR-2758: Moved ConcurrentLRUCache from o.a.s.common.util package in the solrj
module to the o.a.s.util package in the Solr core module.
(David Smiley via Steve Rowe)

* SOLR-2766: Package individual javadoc sites for solrj and test-framework.
(Steve Rowe, Mike McCandless)

* SOLR-2771: Solr modules' tests should not depend on solr-core test classes;
move BufferingRequestProcessor from solr-core tests to test-framework so that
the Solr Cell module can use it. (janhoy, Steve Rowe)

* LUCENE-3457: Upgrade commons-compress to 1.2 (Doron Cohen)

* SOLR-2757: min() and max() functions now support an arbitrary number of
ValueSources (Bill Bell via hossman)

* SOLR-2372: Upgrade Solr to Tika 0.10 (janhoy)

* SOLR-2792: Allow case insensitive Hunspell stemming (janhoy, rmuir)

* SOLR-2862: More explicit lexical resources location logged if Carrot2 clustering
extension is used. Fixed solr. impl. of IResource and IResourceLookup. (Dawid Weiss)

* SOLR-2849: Fix dependencies in Maven POMs. (David Smiley via Steve Rowe)

* SOLR-2591: Remove commitLockTimeout option from solrconfig.xml (Luca Cavanna via Martijn van Groningen)

* SOLR-2746: Upgraded UIMA dependencies from *-2.3.1-SNAPSHOT.jar to *-2.3.1.jar.

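The idea behind LUCENE-3410, one int bitfield in place of several ints used as
booleans, can be illustrated with a small sketch. The flag names and values
below are hypothetical and chosen only for illustration; they are not taken
from the WordDelimiterFilter source.

```python
from enum import IntFlag


class WordDelimiterFlags(IntFlag):
    # Hypothetical flag names and values, for illustration only.
    GENERATE_WORD_PARTS = 1
    GENERATE_NUMBER_PARTS = 2
    CATENATE_WORDS = 4
    PRESERVE_ORIGINAL = 8


def describe(flags):
    """Return the names of the options enabled in an int bitfield."""
    return [f.name for f in WordDelimiterFlags if flags & f]


# One int carries what previously took several 0/1 "boolean" ints.
config = WordDelimiterFlags.GENERATE_WORD_PARTS | WordDelimiterFlags.CATENATE_WORDS
```

Combining options with bitwise OR keeps a constructor signature stable when
new options are added, which is the usual motivation for this style.
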
================== 3.4.0 ==================

Upgrading from Solr 3.3
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

* Previous versions of Solr silently allow and ignore some contradictory
properties specified in schema.xml. For example:
- indexed="false" omitNorms="false"
- indexed="false" omitTermFreqAndPositions="false"
Field property validation has now been fixed so that
contradictions like these generate error messages. If an existing
schema triggers one of the new "conflicting
'false' field options for non-indexed field" error messages, the
conflicting "omit*" properties can safely be removed, or changed to
"true", for behavior consistent with previous Solr versions.
Solr now reports an error on startup when it encounters these
contradictory options. See SOLR-2669.

* FacetComponent no longer catches and embeds exceptions that occur during facet
processing; it throws HTTP 400 or 500 exceptions instead.

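The field property check described in this section (SOLR-2669) can be sketched
roughly as below. This is a simplified illustration under stated assumptions,
not Solr's SchemaField code: check_field_properties is a hypothetical helper,
and it only looks at the two "omit*" properties named above.

```python
def check_field_properties(name, props):
    """Raise ValueError for explicit 'false' omit* options on a non-indexed field.

    props maps property names to booleans and contains only the properties
    that were explicitly declared in the schema (an assumption of this sketch).
    """
    if props.get("indexed", True):
        return  # indexed fields may set omit* options freely
    conflicts = [
        key
        for key in ("omitNorms", "omitTermFreqAndPositions")
        if props.get(key) is False
    ]
    if conflicts:
        raise ValueError(
            "field %r: conflicting 'false' field options for non-indexed field: %s"
            % (name, ", ".join(conflicts))
        )
```

Under this sketch, removing the conflicting property or setting it to "true"
makes the check pass, which matches the advice given above.
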
New Features
----------------------

* SOLR-2540: CommitWithin as an Update Request parameter
You can now specify &commitWithin=N (ms) on the update request (janhoy)

* SOLR-2458: post.jar enhanced to handle JSON, CSV and <optimize> (janhoy)

* LUCENE-3234: add a new parameter hl.phraseLimit for FastVectorHighlighter speed up.
(Mike Sokolov via koji)

* SOLR-2429: Ability to add cache=false to queries and query filters to avoid
using the filterCache or queryCache. A cost may also be specified and is used
to order the evaluation of non-cached filters from least to greatest cost.
For very expensive query filters (cost >= 100) if the query implements
the PostFilter interface, it will be used to obtain a Collector that is
checked only for documents that match the main query and all other filters.
The "frange" query now implements the PostFilter interface. (yonik)

* SOLR-2630: Added new XsltUpdateRequestHandler that works like
XmlUpdateRequestHandler but allows transforming the POSTed XML document
using XSLT. This makes it possible to POST arbitrary XML documents to the update
handler, as long as you also provide an XSL to transform them to a valid
Solr input document. (Upayavira, Uwe Schindler)

* SOLR-2615: Log individual updates (adds and deletes) at the FINE level
before adding to the index. Fix a null pointer exception in logging
when there was no unique key. (David Smiley via yonik)

* LUCENE-2048: Added omitPositions to the schema, so you can omit position
information while still indexing term frequencies. (rmuir)

* SOLR-2584: add UniqFieldsUpdateProcessor that removes duplicate values in the
specified fields. (Elmer Garduno, koji)

* SOLR-2670: Added NIOFSDirectoryFactory (yonik)

* SOLR-2523: Added support in SolrJ to easily interact with range facets.
The range facet response can be parsed and is retrievable from the
QueryResponse class. The SolrQuery class has convenient methods for using
range facets. (Martijn van Groningen)

* SOLR-2637: Added support for group result parsing in SolrJ.
(Tao Cheng, Martijn van Groningen)

* SOLR-2665: Added post group faceting. Facet counts are based on the most
relevant document of each group matching the query. This feature has the
same impact on the StatsComponent. (Martijn van Groningen)

* SOLR-2675: CoreAdminHandler now allows arbitrary properties to be
specified when CREATEing a new SolrCore using property.* request
params. (Yury Kats, hossman)

* SOLR-2714: JSON update format - "null" field values are now dropped
instead of causing an exception. (Trygve Laugstøl, yonik)

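The evaluation order described for SOLR-2429 can be sketched as follows. This
is a simplified model, not Solr's code: plan_filters and the dict layout are
hypothetical, and the 'post_filter' key stands in for "the query implements
the PostFilter interface".

```python
def plan_filters(filters):
    """Partition and order filters as described for SOLR-2429.

    filters: list of dicts with keys 'q' (str), 'cache' (bool), 'cost' (int)
    and 'post_filter' (bool). Returns (cached, regular, post), where regular
    and post are non-cached filters ordered from least to greatest cost.
    """
    def is_post(f):
        # Only very expensive filters that support post filtering qualify.
        return f["cost"] >= 100 and f["post_filter"]

    cached = [f for f in filters if f["cache"]]
    non_cached = sorted(
        (f for f in filters if not f["cache"]), key=lambda f: f["cost"]
    )
    regular = [f for f in non_cached if not is_post(f)]
    post = [f for f in non_cached if is_post(f)]
    return cached, regular, post


# Query strings here are placeholders, not real Solr syntax.
cached, regular, post = plan_filters([
    {"q": "inStock:true", "cache": True, "cost": 0, "post_filter": False},
    {"q": "expensive-range", "cache": False, "cost": 200, "post_filter": True},
    {"q": "cheap-check", "cache": False, "cost": 10, "post_filter": False},
    {"q": "mid-check", "cache": False, "cost": 50, "post_filter": False},
])
```

In this model the post filters run last, only against documents that already
matched the main query and every other filter.
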
Optimizations
----------------------

* LUCENE-3233: Improved memory usage, build time, and performance of
SynonymFilterFactory. (Mike McCandless, Robert Muir)

Bug Fixes
----------------------

* SOLR-2625: TermVectorComponent throws NPE if TF-IDF option is used without DF
option. (Daniel Erenrich, Simon Willnauer)

* SOLR-2631: PingRequestHandler should not be allowed to ping itself using the "qt"
param, to prevent an infinite loop. (Edoardo Tosca, Uwe Schindler)

* SOLR-2636: Fix explain functionality for negative queries. (Tom Hill via yonik)

* SOLR-2538: Range Faceting on long/double fields could overflow if values
bigger than the max int/float were used.
(Erbi Hanka, hossman)

* SOLR-2230: CommonsHttpSolrServer.addFile could not be used to send
multiple files in a single request.
(Stephan Günther, hossman)

* SOLR-2541: PluginInfos was not correctly parsing <long/> tags when
initializing plugins
(Frank Wesemann, hossman)

* SOLR-2623: Solr JMX MBeans do not survive core reloads (Alexey Serba, shalin)

* Fixed grouping bug where zero documents were returned when start is bigger than rows and
format is simple, even if there are documents to display. (Martijn van Groningen, Nikhil Chhaochharia)

* SOLR-2564: Fixed ArrayIndexOutOfBoundsException when using simple format and
start > 0 (Martijn van Groningen, Matteo Melli)

* SOLR-2642: Fixed sorting by function when using grouping. (Thomas Heigl, Martijn van Groningen)

* SOLR-2535: REGRESSION: in Solr 3.x and trunk the admin/file handler
fails to show directory listings (David Smiley, Peter Wolanin via Erick Erickson)

* SOLR-2545: ExternalFileField file parsing would fail if any key
contained an "=" character. It now only looks for the last "=" delimiter
prior to the float value.
(Markus Jelsma, hossman)

* SOLR-2662: When Solr is configured to have no queryResultCache, the
"start" parameter was not honored and the documents returned were
0 through start+offset. (Markus Jelsma, yonik)

* SOLR-2669: Fix backwards validation of field properties in
SchemaField.calcProps (hossman)

* SOLR-2676: Add "welcome-file-list" to solr.war so admin UI works correctly
in servlet containers such as WebSphere that do not use a default list
(Jay R. Jaeger, hossman)

* SOLR-2606: Fixed sort parsing of fields containing punctuation that
failed due to sort by function changes introduced in SOLR-1297
(Mitsu Hadeishi, hossman)

* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter
now works with absolute directories (Stanislaw Osinski)

* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise"
changed to "carrot.fragSize" (Stanislaw Osinski).

* SOLR-2644: When using DIH with threads=2 the default logging is set too high
(Bill Bell via shalin)

* SOLR-2492: DIH does not commit if only deletes are processed
(James Dyer via shalin)

* SOLR-2186: DataImportHandler's multi-threaded option throws NPE
(Lance Norskog, Frank Wesemann, shalin)

* SOLR-2655: DIH multi threaded mode does not resolve attributes correctly
(Frank Wesemann, shalin)

* SOLR-2695: DIH: Documents are collected in unsynchronized list in
multi-threaded debug mode (Michael McCandless, shalin)

* SOLR-2668: DIH multithreaded mode does not rollback on errors from
EntityProcessor (Frank Wesemann, shalin)

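The parsing rule from SOLR-2545, splitting on the last "=" before the float
value, can be sketched as below. This is an illustration of the rule only,
not the ExternalFileField code; parse_external_file_line is a hypothetical
name.

```python
def parse_external_file_line(line):
    """Split 'key=value' on the last '=' so keys may themselves contain '='."""
    key, sep, value = line.rstrip("\n").rpartition("=")
    if not sep:
        raise ValueError("missing '=' delimiter: %r" % line)
    return key, float(value)
```

Splitting on the first "=" instead would turn a key such as "a=b" into the key
"a" with the unparsable value "b=1.5", which is the failure the fix addresses.
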
Other Changes
----------------------

* SOLR-2629: Eliminate deprecation warnings in some JSPs.
(Bernd Fehling, hossman)

* SOLR-2743: Remove commons logging from contrib/extraction. (koji)


Build
----------------------

* SOLR-2452,SOLR-2653,LUCENE-3323,SOLR-2659,LUCENE-3329,SOLR-2666:
Rewrote the Solr build system:
- Integrated more fully with the Lucene build system: generalized the
Lucene build system and eliminated duplication.
- Converted all Solr contribs to the Lucene/Solr conventional src/ layout:
java/, resources/, test/, and test-files/<contrib-name>.
- Created a new Solr-internal module named "core" by moving the java/,
test/, and test-files/ directories from solr/src/ to solr/core/src/.
- Merged solr/src/webapp/src/ into solr/core/src/java/.
- Eliminated solr/src/ by moving all its directories up one level;
renamed solr/src/site/ to solr/site-src/ because solr/site/ already
exists.
- Merged solr/src/common/ into solr/solrj/src/java/.
- Moved o.a.s.client.solrj.* and o.a.s.common.* tests from
solr/src/test/ to solr/solrj/src/test/.
- Made the solrj tests not depend on the solr core tests by moving
some classes from solr/src/test/ to solr/test-framework/src/java/.
- Each internal module (core/, solrj/, test-framework/, and webapp/)
now has its own build.xml, from which it is possible to run
module-specific targets. solr/build.xml delegates all build
tasks (via <ant dir="internal-module-dir"> calls) to these
modules' build.xml files.
(Steve Rowe, Robert Muir)

* LUCENE-3406: Add ant target 'package-local-src-tgz' to Lucene and Solr
to package sources from the local working copy.
(Seung-Yeoul Yang via Steve Rowe)

Documentation
----------------------

================== 3.3.0 ==================

Upgrading from Solr 3.2.0
----------------------
* SolrCore's CloseHook API has been changed in a backward-incompatible way. It
has been changed from an interface to an abstract class. Any custom
components which use the SolrCore.addCloseHook method will need to
be modified accordingly. To migrate, put your old CloseHook#close impl into
CloseHook#preClose.

New Features
----------------------

* SOLR-2378: A new, automaton-based, implementation of suggest (autocomplete)
component, offering an order of magnitude smaller memory consumption
compared to ternary trees and jaspell and very fast lookups at runtime.
(Dawid Weiss)

* SOLR-2400: Field- and DocumentAnalysisRequestHandler now provide a position
history for each token, so you can follow the token through all analysis stages.
The output contains a separate int[] attribute containing all positions from
previous Tokenizers/TokenFilters (called "positionHistory").
(Uwe Schindler)

* SOLR-2524: (SOLR-236, SOLR-237, SOLR-1773, SOLR-1311) Grouping / Field collapsing
using the Lucene grouping contrib. The search result can be grouped by field and query.
(Martijn van Groningen, Emmanuel Keller, Shalin Shekhar Mangar, Koji Sekiguchi,
Iván de Prado, Ryan McKinley, Marc Sturlese, Peter Karich, Bojan Smid,
Charles Hornberger, Dieter Grad, Dmitry Lihachev, Doug Steigerwald,
Karsten Sperling, Michael Gundlach, Oleg Gnatovskiy, Thomas Traeger,
Harish Agarwal, yonik, Michael McCandless, Bill Bell)

* SOLR-1331 -- Added a srcCore parameter to CoreAdminHandler's mergeindexes action
to merge one or more cores' indexes to a target core (shalin)

* SOLR-2610 -- Add an option to delete index through CoreAdmin UNLOAD action (shalin)

* SOLR-2480: Add ignoreTikaException flag to the extraction request handler so
that users can ignore TikaException but index meta data.
(Shinichiro Abe, koji)

* SOLR-2582: Use uniqueKey for error log in UIMAUpdateRequestProcessor.
(Tommaso Teofili via koji)

Optimizations
----------------------

* SOLR-2567: Solr now defaults to TieredMergePolicy. See http://s.apache.org/merging
for more information. (rmuir)

Bug Fixes
----------------------

* SOLR-2519: Improve text_* fieldTypes in example schema.xml: improve
cross-language defaults for text_general; break out separate
English-specific fieldTypes (Jan Høydahl, hossman, Robert Muir,
yonik, Mike McCandless)

* SOLR-2462: Fix extremely high memory usage problems with spellcheck.collate.
Separately, an additional spellcheck.maxCollationEvaluations (default=10000)
parameter is added to avoid excessive CPU time in extreme cases (e.g. long
queries with many misspelled words). (James Dyer via rmuir)

* SOLR-2579: UIMAUpdateRequestProcessor ignore error fails if text.length() < 100.
(Elmer Garduno via koji)

* SOLR-2581: UIMAToSolrMapper wrongly instantiates Type with reflection.
(Tommaso Teofili via koji)

* SOLR-2551: Check dataimport.properties for write access (if delta-import is
supported in DIH configuration) before starting an import (C S, shalin)

Other Changes
----------------------

* SOLR-2571: Add a commented out example of the spellchecker's thresholdTokenFrequency
parameter to the example solrconfig.xml, and also add a unit test for this feature.
(James Dyer via rmuir)

* SOLR-2576: Deprecate SpellingResult.add(Token token, int docFreq), please use
SpellingResult.addFrequency(Token token, int docFreq) instead.
(James Dyer via rmuir)

* SOLR-2574: Upgrade slf4j to v1.6.1 (shalin)

* LUCENE-3204: The maven-ant-tasks jar is now included in the source tree;
users of the generate-maven-artifacts target no longer have to manually
place this jar in the Ant classpath. NOTE: when Ant looks for the
maven-ant-tasks jar, it looks first in its pre-existing classpath, so
any copies it finds will be used instead of the copy included in the
Lucene/Solr source tree. For this reason, it is recommended to remove
any copies of the maven-ant-tasks jar in the Ant classpath, e.g. under
~/.ant/lib/ or under the Ant installation's lib/ directory. (Steve Rowe)

* SOLR-2611: Fix typos in the example configuration (Eric Pugh via rmuir)

================== 3.2.0 ==================
Versions of Major Components
---------------------
Apache Lucene trunk
Apache Tika 0.8
Carrot2 3.4.2


Upgrading from Solr 3.1
----------------------

* The updateRequestProcessorChain for a RequestHandler is now defined
with update.chain rather than update.processor. The latter still works,
but has been deprecated.

* <uimaConfig/> just beneath <config> ... </config> is no longer supported.
It should move to UIMAUpdateRequestProcessorFactory setting.
See contrib/uima/README.txt for more details. (SOLR-2436)

Detailed Change List
----------------------

New Features
----------------------

* SOLR-2496: Add ability to specify overwrite and commitWithin as request
parameters (e.g. specified in the URL) when using the JSON update format,
and added a simplified format for specifying multiple documents.
Example: [{"id":"doc1"},{"id":"doc2"}]
(yonik)

* SOLR-2113: Add TermQParserPlugin, registered as "term". This is useful
when generating filter queries from terms returned from field faceting or
the terms component. Example: fq={!term f=weight}1.5 (hossman, yonik)

* SOLR-1915: DebugComponent now supports using a NamedList to model
Explanation objects in its responses instead of
Explanation.toString (hossman)

* SOLR-2448: Search results clustering updates: bisecting k-means
clustering algorithm added, loading of Carrot2 stop words from
<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
(Stanislaw Osinski, Dawid Weiss).

* SOLR-2503: extend UIMAUpdateRequestProcessorFactory mapping function to
map feature value to dynamicField. (koji)

* SOLR-2512: add ignoreErrors flag to UIMAUpdateRequestProcessorFactory so
that users can ignore exceptions in AE. (Tommaso Teofili, koji)

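Generating filter queries from facet terms with the "term" parser (SOLR-2113)
might look like the sketch below. Only the fq={!term f=weight}1.5 syntax comes
from the entry above; the helper names term_filter and filter_params are
hypothetical, and the field is assumed to be a plain field name.

```python
from urllib.parse import urlencode


def term_filter(field, term):
    """Build an fq value using the "term" parser syntax shown in SOLR-2113."""
    return "{!term f=%s}%s" % (field, term)


def filter_params(field, facet_terms):
    """URL-encode one fq parameter per facet term."""
    return urlencode([("fq", term_filter(field, t)) for t in facet_terms])
```

For example, term_filter("weight", "1.5") reproduces the fq value quoted in
the entry, and filter_params percent-encodes it for use in a request URL.
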
Optimizations
----------------------

Bug Fixes
----------------------

* SOLR-2445: Change the default qt to blank in form.jsp, because there is no "standard"
request handler unless you have it in your solrconfig.xml explicitly. (koji)

* SOLR-2455: Prevent double submit of forms in admin interface.
(Jeffrey Chang via uschindler)

* SOLR-2464: Fix potential slowness in QueryValueSource (the query() function) when
the query is very sparse and may not match any documents in a segment. (yonik)

* SOLR-2469: When using java replication with replicateAfter=startup, the first
commit point on server startup is never removed. (yonik)

* SOLR-2466: SolrJ's CommonsHttpSolrServer would retry requests on failure, regardless
of the configured maxRetries, due to HttpClient having its own retry mechanism
by default. The retryCount of HttpClient is now set to 0, and SolrJ does
the retry. (yonik)

* SOLR-2409: edismax parser - treat the text of a fielded query as a literal if the
fieldname does not exist. For example Mission: Impossible should not search on
the "Mission" field unless it's a valid field in the schema. (Ryan McKinley, yonik)

* SOLR-2403: facet.sort=index reported incorrect results for distributed search
in a number of scenarios when facet.mincount>0. This patch also adds some
performance/algorithmic improvements when (facet.sort=count && facet.mincount=1
&& facet.limit=-1) and when (facet.sort=index && facet.mincount>0) (yonik)

* SOLR-2333: The "rename" core admin action does not persist the new name to solr.xml
(Rasmus Hahn, Paul R. Brown via Mark Miller)

* SOLR-2390: Performance of usePhraseHighlighter is terrible on very large Documents,
regardless of hl.maxDocCharsToAnalyze. (Mark Miller)

* SOLR-2474: The helper TokenStreams in analysis.jsp and AnalysisRequestHandlerBase
did not clear all attributes so they displayed incorrect attribute values for tokens
in later filter stages. (uschindler, rmuir, yonik)

* SOLR-2467: Fix <analyzer class="..." /> initialization so any errors
are logged properly. (hossman)

* SOLR-2493: SolrQueryParser was fixed to not parse the SolrConfig DOM tree on each
instantiation, which caused a huge slowdown. (Stephane Bailliez via uschindler)

* SOLR-2495: The JSON parser could hang on corrupted input and could fail
to detect numbers that were too large to fit in a long. (yonik)

* SOLR-2520: Make JSON response format escape \u2029 as well as \u2028
in strings since those characters are not valid in javascript strings
(although they are valid in JSON strings). (yonik)

* SOLR-2536: Add ReloadCacheRequestHandler to fix ExternalFileField bug (if reopenReaders
is set to true and no index segments have changed, a commit cannot trigger a reload
of the external file). (koji)

* SOLR-2539: VectorValueSource.floatVal incorrectly used byteVal on sub-sources.
(Tom Liu via yonik)

* SOLR-2554: RandomSortField didn't work when used in a function query. (yonik)

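The escaping described in SOLR-2520 can be sketched with the standard json
module. This is an illustration of the idea rather than Solr's JSON writer;
the function name to_js_safe_json is hypothetical.

```python
import json


def to_js_safe_json(value):
    """Serialize to JSON, escaping U+2028 and U+2029 for JavaScript consumers."""
    # ensure_ascii=False leaves non-ASCII characters unescaped, which is the
    # situation in which the two separators could reach a JavaScript parser raw.
    text = json.dumps(value, ensure_ascii=False)
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")


out = to_js_safe_json({"t": "a\u2028b\u2029c"})
```

The escaped output is still valid JSON and decodes back to the original
string, so JSON clients are unaffected by the change.
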
Other Changes
----------------------

* SOLR-2061: Pull base tests out into a new Solr Test Framework module,
and publish binary, javadoc, and source test-framework jars.
(Drew Farris, Robert Muir, Steve Rowe)

* SOLR-2105: Rename RequestHandler param 'update.processor' to 'update.chain'.
(Jan Høydahl via Mark Miller)

* SOLR-2485: Deprecate BaseResponseWriter, GenericBinaryResponseWriter, and
GenericTextResponseWriter. These classes will be removed in 4.0. (ryan)

* SOLR-2451: Enhance assertJQ to allow individual tests to specify the
tolerance delta used in numeric equalities. This allows for slight
variance in asserting score comparisons in unit tests.
(David Smiley, Chris Hostetter)

* SOLR-2528: Remove default="true" from HtmlEncoder in example solrconfig.xml,
because html encoding confuses non-ascii users. (koji)

* SOLR-2387: add mock annotators for improved testing in contrib/uima.
(Tommaso Teofili via rmuir)

* SOLR-2436: move uimaConfig to under the uima's update processor in
solrconfig.xml. (Tommaso Teofili, koji)

Build
----------------------

* LUCENE-3006: Building javadocs will fail on warnings by default. Override with -Dfailonjavadocwarning=false (sarowe, gsingers)


Documentation
----------------------

================== 3.1.0 ==================
Versions of Major Components
---------------------
Apache Lucene 3.1.0
Apache Tika 0.8
Carrot2 3.4.2
Velocity 1.6.1 and Velocity Tools 2.0-beta3
Apache UIMA 2.3.1-SNAPSHOT


Upgrading from Solr 1.4
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)

* The experimental ALIAS command has been removed (SOLR-1637)

* Using solr.xml is recommended for single cores also (SOLR-1621)

* Old syntax of <highlighting> configuration in solrconfig.xml
is deprecated (SOLR-1696)

* The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
HTMLStripCharFilter should be used instead, and it works with any
Tokenizer of your choice. (SOLR-1657)

* Field compression is no longer supported. Fields that were formerly
compressed will be uncompressed as index segments are merged. For
shorter fields, this may actually be an improvement, as the compression
used was not very good for short text. Some indexes may get larger though.

* SOLR-1845: The TermsComponent response format was changed so that the
"terms" container is a map instead of a named list. This affects
response formats like JSON, but not XML. (yonik)

* SOLR-1876: All Analyzers and TokenStreams are now final to enforce
the decorator pattern. (rmuir, uschindler)

* LUCENE-2608: Added the ability to specify the accuracy on a per request basis.
It is recommended, but not required, that implementations of SolrSpellChecker change over
to the new SolrSpellChecker methods using the new SpellingOptions class. While this change is
backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method. (gsingers)

* readercycle script was removed. (SOLR-2046)

* In previous releases, sorting or evaluating function queries on
fields that were "multiValued" (either by explicit declaration in
schema.xml or by implicit behavior because the "version" attribute on
the schema was less than 1.2) did not generally work, but it would
sometimes silently act as if it succeeded and order the docs
arbitrarily. Solr will now fail on any attempt to sort, or apply a
function to, multi-valued fields.

* The DataImportHandler jars are no longer included in the solr
WAR and should be added in Solr's lib directory, or referenced
via the <lib> directive in solrconfig.xml.

Detailed Change List
|
||
----------------------
|
||
|
||
New Features
|
||
----------------------
|
||
|
||
* SOLR-1302: Added several new distance based functions, including
|
||
Great Circle (haversine), Manhattan, Euclidean and String (using the
|
||
StringDistance methods in the Lucene spellchecker).
|
||
Also added geohash(), deg() and rad() convenience functions.
|
||
See http://wiki.apache.org/solr/FunctionQuery. (gsingers)
|
||
|
||
* SOLR-1553: New dismax parser implementation (accessible as "edismax")
|
||
that supports full lucene syntax, improved reserved char escaping,
|
||
fielded queries, improved proximity boosting, and improved stopword
|
||
handling. Note: status is experimental for now. (yonik)
|
||
|
||
* SOLR-1574: Add many new functions from java Math (e.g. sin, cos) (yonik)
|
||
|
||
* SOLR-1569: Allow functions to take in literal strings by modifying the
|
||
FunctionQParser and adding LiteralValueSource (gsingers)
|
||
|
||
* SOLR-1571: Added unicode collation support though Lucene's CollationKeyFilter
|
||
(Robert Muir via shalin)
|
||
|
||
* SOLR-785: Distributed Search support for SpellCheckComponent
|
||
(Matthew Woytowitz, shalin)
|
||
|
||
* SOLR-1625: Add regexp support for TermsComponent (Uri Boness via noble)
|
||
|
||
* SOLR-1297: Add sort by Function capability (gsingers, yonik)
|
||
|
||
* SOLR-1139: Add TermsComponent Query and Response Support in SolrJ (Matt Weber via shalin)
|
||
|
||
* SOLR-1177: Distributed Search support for TermsComponent (Matt Weber via shalin)
|
||
|
||
* SOLR-1621, SOLR-1722: Allow current single core deployments to be specified by solr.xml (Mark Miller , noble)
|
||
|
||
* SOLR-1532: Allow StreamingUpdateSolrServer to use a provided HttpClient (Gabriele Renzi via shalin)
|
||
|
||
* SOLR-1653: Add PatternReplaceCharFilter (koji)
|
||
|
||
* SOLR-1131: FieldTypes can now output multiple Fields per Type and still be searched. This can be handy for hiding the details of a particular
|
||
implementation such as in the spatial case. (Chris Mattmann, shalin, noble, gsingers, yonik)
|
||
|
||
* SOLR-1586: Add support for Geohash and Spatial Tile FieldType (Chris Mattmann, gsingers)
|
||
|
||
* SOLR-1697: PluginInfo should load plugins w/o class attribute also (noble)
|
||
|
||
* SOLR-1268: Incorporate FastVectorHighlighter (koji)
|
||
|
||
* SOLR-1750: SolrInfoMBeanHandler added for simpler programmatic access
|
||
to info currently available from registry.jsp and stats.jsp
|
||
(ehatcher, hossman)
|
||
|
||
* SOLR-1815: SolrJ now preserves the order of facet queries. (yonik)
|
||
|
||
* SOLR-1677: Add support for choosing the Lucene Version for Lucene components within
|
||
Solr. (Uwe Schindler, Mark Miller)
|
||
|
||
* SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage.
|
||
(Alex Baranov via yonik)
|
||
|
||
* SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
|
||
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
|
||
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
|
||
performance of SnowballPorterFilterFactory. (rmuir)
|
||
|
||
* SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
|
||
TokenFilters now support custom Attributes, and some have improved performance:
|
||
especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler)
|
||
|
||
* SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
|
||
parameters for controlling the minimum shingle size produced by the filter, and
|
||
the separator string that it uses, respectively. (Steven Rowe via rmuir)
|
||
|
||
* SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
|
||
parameter, to output unigrams if the number of input tokens is fewer than
|
||
minShingleSize, and no shingles can be generated.
|
||
(Chris Harris via Steven Rowe)
|
||
|
||
* SOLR-1923: PhoneticFilterFactory now has support for the
|
||
Caverphone algorithm. (rmuir)
|
||
|
||
* SOLR-1957: The VelocityResponseWriter contrib moved to core.
|
||
Example search UI now available at http://localhost:8983/solr/browse
|
||
(ehatcher)
|
||
|
||
* SOLR-1974: Add LimitTokenCountFilterFactory. (koji)
|
||
|
||
* SOLR-1966: QueryElevationComponent can now return just the included results in the elevation file (gsingers, yonik)
|
||
|
||
* SOLR-1556: TermVectorComponent now supports per field overrides. Also, it now throws an error
if passed in fields do not exist, and returns warnings for fields whose term vector
options (termVectors, offsets, positions) do not align with the schema declaration. (gsingers)
|
||
|
||
* SOLR-1985: FastVectorHighlighter: add wrapper class for Lucene's SingleFragListBuilder (koji)
|
||
|
||
* SOLR-1984: Add HyphenationCompoundWordTokenFilterFactory. (PB via rmuir)
|
||
|
||
* SOLR-397: Date Faceting now supports a "facet.date.include" param
|
||
for specifying when the upper & lower end points of computed date
|
||
ranges should be included in the range. Legal values are: "all",
|
||
"lower", "upper", "edge", and "outer". For backwards compatibility
|
||
the default value is the set: [lower,upper,edge], so that all ranges
|
||
between start and end are inclusive of their endpoints, but the
|
||
"before" and "after" ranges are not.
|
||
|
||
* SOLR-945: JSON update handler that accepts add, delete, commit
|
||
commands in JSON format. (Ryan McKinley, yonik)
|
||
|
||
* SOLR-2015: Add a boolean attribute autoGeneratePhraseQueries to TextField.
|
||
autoGeneratePhraseQueries="true" (the default) causes the query parser to
|
||
generate phrase queries if multiple tokens are generated from a single
|
||
non-quoted analysis string. For example WordDelimiterFilter splitting text:pdp-11
|
||
will cause the parser to generate text:"pdp 11" rather than (text:PDP OR text:11).
|
||
Note that autoGeneratePhraseQueries="true" tends to not work well for non whitespace
|
||
delimited languages. (yonik)
|
||
|
||
* SOLR-1925: Add CSVResponseWriter (use wt=csv) that returns the list of documents
|
||
in CSV format. (Chris Mattmann, yonik)
|
||
|
||
* SOLR-1240: "Range Faceting" has been added. This is a generalization
|
||
of the existing "Date Faceting" logic so that it now supports
|
||
all stock numeric field types that support range queries in addition
|
||
to dates. facet.date is now deprecated in favor of this generalized mechanism.
|
||
(Gijs Kunze, hossman)
|
||
|
||
* SOLR-2021: Add SolrEncoder plugin to Highlighter. (koji)
|
||
|
||
* SOLR-2030: Make FastVectorHighlighter use SolrEncoder. (koji)
|
||
|
||
* SOLR-2053: Add support for custom comparators in Solr spellchecker, per LUCENE-2479 (gsingers)
|
||
|
||
* SOLR-2049: Add hl.multiValuedSeparatorChar for FastVectorHighlighter, per LUCENE-2603. (koji)
|
||
|
||
* SOLR-2059: Add "types" attribute to WordDelimiterFilterFactory, which
|
||
allows you to customize how WordDelimiterFilter tokenizes text with
|
||
a configuration file. (Peter Karich, rmuir)
|
||
|
||
* SOLR-2099: Add ability to throttle rsync based replication using rsync option --bwlimit.
|
||
(Brandon Evans via koji)
|
||
|
||
* SOLR-1316: Create autosuggest component.
|
||
(Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab)
|
||
|
||
* SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
|
||
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
|
||
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers)
|
||
|
||
* SOLR-2128: Full parameter substitution for function queries.
|
||
Example: q=add($v1,$v2)&v1=mul(popularity,5)&v2=20.0
|
||
(yonik)
|
||
|
||
* SOLR-2133: Function query parser can now parse multiple comma separated
|
||
value sources. It also now fails if there is extra unexpected text
|
||
after parsing the functions, instead of silently ignoring it.
|
||
This allows expressions like q=dist(2,vector(1,2),$pt)&pt=3,4 (yonik)
|
||
|
||
* SOLR-2157: Suggester should return alpha-sorted results when onlyMorePopular=false (ab)
|
||
|
||
* SOLR-2010: Added ability to verify that spell checking collations have
|
||
actual results in the index. (James Dyer via gsingers)
|
||
|
||
* SOLR-2188: Added "maxTokenLength" argument to the factories for ClassicTokenizer,
|
||
StandardTokenizer, and UAX29URLEmailTokenizer. (Steven Rowe)
|
||
|
||
* SOLR-2129: Added a Solr module for dynamic metadata extraction/indexing with Apache UIMA.
|
||
See contrib/uima/README.txt for more information. (Tommaso Teofili via rmuir)
|
||
|
||
* SOLR-2325: Allow tagging and exclusion of main query for faceting. (yonik)
|
||
|
||
* SOLR-2263: Add ability for RawResponseWriter to stream binary files as well as
|
||
text files. (Eric Pugh via yonik)
|
||
|
||
* SOLR-860: Add debug output for MoreLikeThis. (koji)
|
||
|
||
* SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji)
|
||
|
||
* SOLR-1804: Re-enabled clustering component on trunk, updated to latest
|
||
version of Carrot2. No more LGPL run-time dependencies. This release of
|
||
C2 also does not have a specific Lucene dependency.
|
||
(Stanislaw Osinski, gsingers)
|
||
|
||
* SOLR-2282: Add distributed search support for search result clustering.
|
||
(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
|
||
|
||
* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
|
||
|
||
* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
|
||
tokenizer and filters to contrib/analysis-extras (rmuir)
|
||
|
||
* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
|
||
UAX#29, a unicode algorithm with good results for most languages, as well as
|
||
URL and E-mail tokenization according to the relevant RFCs.
|
||
(Tom Burton-West via rmuir)
|
||
|
||
* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
|
||
|
||
* SOLR-1525: allow DIH to refer to core properties (noble)
|
||
|
||
* SOLR-1547: DIH TemplateTransformer copy objects more intelligently when the
|
||
template is a single variable (noble)
|
||
|
||
* SOLR-1627: DIH VariableResolver should be fetched just in time (noble)
|
||
|
||
* SOLR-1583: DIH Create DataSources that return InputStream (noble)
|
||
|
||
* SOLR-1358: Integration of Tika and DataImportHandler (Akshay Ukey, noble)
|
||
|
||
* SOLR-1654: TikaEntityProcessor example added to DIHExample
|
||
(Akshay Ukey via noble)
|
||
|
||
* SOLR-1678: Move onError handling to DIH framework (noble)
|
||
|
||
* SOLR-1352: Multi-threaded implementation of DIH (noble)
|
||
|
||
* SOLR-1721: Add explicit option to run DataImportHandler in synchronous mode
|
||
(Alexey Serba via noble)
|
||
|
||
* SOLR-1737: Added FieldStreamDataSource (noble)
|
||
|
||
|
||
Optimizations
|
||
----------------------
|
||
|
||
* SOLR-1679: Don't build up string messages in SolrCore.execute unless they
|
||
are necessary for the current log level.
|
||
(Fuad Efendi and hossman)
|
||
|
||
* SOLR-1874: Optimize PatternReplaceFilter for better performance. (rmuir, uschindler)
|
||
|
||
* SOLR-1968: speed up initial filter cache population for facet.method=enum and
|
||
also big terms for multi-valued facet.method=fc. The resulting speedup
|
||
for the first facet request is anywhere from 30% to 32x, depending on how many
|
||
terms are in the field and how many documents match per term. (yonik)
|
||
|
||
* SOLR-2089: Speed up UnInvertedField faceting (facet.method=fc for
|
||
multi-valued fields) when facet.limit is both high, and a high enough
|
||
percentage of the number of unique terms in the field. Extreme cases
|
||
yield speedups over 3x. (yonik)
|
||
|
||
* SOLR-2046: add common functions to scripts-util. (koji)
|
||
|
||
* SOLR-1684: Switch clustering component to use the
|
||
SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document
|
||
cache (gsingers)
|
||
|
||
* SOLR-2200: Improve the performance of DataImportHandler for large
|
||
delta-import updates. (Mark Waddle via rmuir)
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
|
||
|
||
* SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate
|
||
to the original ValueSource.getValues(reader) so custom sources
|
||
will work. (yonik)
|
||
|
||
* SOLR-1572: FastLRUCache correctly implemented the LRU policy only
|
||
for the first 2B accesses. (yonik)
|
||
|
||
* SOLR-1582: copyField was ignored for BinaryField types (gsingers)
|
||
|
||
* SOLR-1563: Binary fields, including trie-based numeric fields, caused null
|
||
pointer exceptions in the luke request handler. (yonik)
|
||
|
||
* SOLR-1577: The example solrconfig.xml defaulted to a solr data dir
|
||
relative to the current working directory, even if a different solr home
|
||
was being used. The new behavior changes the default to a zero length
|
||
string, which is treated the same as if no dataDir had been specified,
|
||
hence the "data" directory under the solr home will be used. (yonik)
|
||
|
||
* SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added
|
||
fl=score to the parameter list instead of appending score to the
|
||
existing field list. (yonik)
|
||
|
||
* SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always
|
||
uses Lucene default. (Lance Norskog via Mark Miller)
|
||
|
||
* SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
|
||
(i.e. code points outside of the BMP), resulting in incorrect
|
||
matching. This change requires reindexing for any content with
|
||
such characters. (Robert Muir, yonik)
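
The surrogate-pair problem described in SOLR-1593 can be illustrated with plain Java strings. This is only a sketch and not the ReverseWildcardFilter code: the class and method names are hypothetical, and it assumes nothing more than that the filter reverses token text. Reversing UTF-16 code units one at a time swaps the two halves of a surrogate pair, while a code-point-aware reversal keeps them in order.

```java
public class SurrogateReverseDemo {

    // Naive reversal: moves each UTF-16 code unit on its own, so a
    // surrogate pair ends up in low-high order (invalid UTF-16).
    static String reverseByCodeUnit(String s) {
        char[] out = new char[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[s.length() - 1 - i] = s.charAt(i);
        }
        return new String(out);
    }

    // StringBuilder.reverse() treats a surrogate pair as one unit and
    // keeps the high surrogate before the low surrogate.
    static String reverseByCodePoint(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        String s = "a\uD835\uDD0Ab"; // 'a', U+1D50A (outside the BMP), 'b'
        String naive = reverseByCodeUnit(s);
        String aware = reverseByCodePoint(s);
        // In the naive result the high surrogate now follows the low one.
        System.out.println(Character.isHighSurrogate(naive.charAt(2)));
        // In the code-point-aware result the pair is still intact.
        System.out.println(aware.codePointAt(1) == 0x1D50A);
    }
}
```

Terms indexed with the broken form no longer match the corrected form at query time, which is consistent with the note that affected content must be reindexed.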
|
||
|
||
* SOLR-1596: A rollback operation followed by the shutdown of Solr
|
||
or the close of a core resulted in a warning:
|
||
"SEVERE: SolrIndexWriter was not closed prior to finalize()" although
|
||
there were no other consequences. (yonik)
|
||
|
||
* SOLR-1595: StreamingUpdateSolrServer used the platform default character
|
||
set when streaming updates, rather than using UTF-8 as the HTTP headers
|
||
indicated, leading to an encoding mismatch. (hossman, yonik)
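
The kind of mismatch described in SOLR-1595 can be reproduced without SolrJ. The sketch below is illustrative only: the class is hypothetical, and ISO-8859-1 is an assumed stand-in for a platform default charset that is not UTF-8. Text encoded with one charset and decoded as UTF-8 does not survive the round trip.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Simulates a client that encodes with one charset while the receiver
    // decodes with the charset announced in the headers.
    static String roundTrip(String text, Charset senderCharset, Charset declaredCharset) {
        byte[] wire = text.getBytes(senderCharset);
        return new String(wire, declaredCharset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        // ISO-8859-1 stands in for a non-UTF-8 platform default (an assumption).
        String mismatched = roundTrip(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        String matched = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(text.equals(mismatched)); // false
        System.out.println(text.equals(matched));    // true
    }
}
```

Naming the charset explicitly on the sending side, instead of relying on the platform default, is what keeps the bytes consistent with the declared encoding.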
|
||
|
||
* SOLR-1587: A distributed search request with fl=score, didn't match
|
||
the behavior of a non-distributed request since it only returned
|
||
the id,score fields instead of all fields in addition to score. (yonik)
|
||
|
||
* SOLR-1601: Schema browser does not indicate presence of charFilter. (koji)
|
||
|
||
* SOLR-1615: Backslash escaping did not work in quoted strings
|
||
for local param arguments. (Wojtek Piaseczny, yonik)
|
||
|
||
* SOLR-1628: log contains incorrect number of adds and deletes.
|
||
(Thijs Vonk via yonik)
|
||
|
||
* SOLR-343: Date faceting now respects facet.mincount limiting
|
||
(Uri Boness, Raiko Eckstein via hossman)
|
||
|
||
* SOLR-1624: Highlighter only highlights values from the first field value
|
||
in a multivalued field when term positions (term vectors) are stored.
|
||
(Chris Harris via yonik)
|
||
|
||
* SOLR-1635: Fixed error message when numeric values can't be parsed by
|
||
DOMUtils - notably for plugin init params in solrconfig.xml.
|
||
(hossman)
|
||
|
||
* SOLR-1651: Fixed Incorrect dataimport handler package name in SolrResourceLoader
|
||
(Akshay Ukey via shalin)
|
||
|
||
* SOLR-1660: CapitalizationFilter crashes if you use the maxWordCountOption
|
||
(Robert Muir via shalin)
|
||
|
||
* SOLR-1667: PatternTokenizer does not reset attributes such as positionIncrementGap
|
||
(Robert Muir via shalin)
|
||
|
||
* SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that
|
||
could halt the streaming of documents. The original patch to fix this
|
||
(never officially released) introduced another hanging bug due to
|
||
connections not being released.
|
||
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)
|
||
|
||
* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
|
||
retrieved from ContentStreams are not closed in various places, resulting
|
||
in file descriptor leaks.
|
||
(Christoph Brill, Mark Miller)
|
||
|
||
* SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search
|
||
(Janne Majaranta via koji)
|
||
|
||
* SOLR-1736: In the slave, if moving a file does not succeed, copy the file (noble)
|
||
|
||
* SOLR-1579: Fixes to XML escaping in stats.jsp
|
||
(David Bowen and hossman)
|
||
|
||
* SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can
|
||
result in incorrectly sorted results. (yonik)
|
||
|
||
* SOLR-1798: Small memory leak (~100 bytes) in fastLRUCache for every
|
||
commit. (yonik)
|
||
|
||
* SOLR-1823: Fixed XMLResponseWriter (via XMLWriter) so it no longer throws
|
||
a ClassCastException when a Map containing a non-String key is used.
|
||
(Frank Wesemann, hossman)
|
||
|
||
* SOLR-1797: fix ConcurrentModificationException and potential memory
|
||
leaks in ResourceLoader. (yonik)
|
||
|
||
* SOLR-1850: change KeepWordFilter so a new word set is not created for
|
||
each instance (John Wang via yonik)
|
||
|
||
* SOLR-1706: fixed WordDelimiterFilter for certain combinations of options
|
||
where it would output incorrect tokens. (Robert Muir, Chris Male)
|
||
|
||
* SOLR-1936: The JSON response format needed to escape unicode code point
|
||
U+2028 - 'LINE SEPARATOR' (Robert Hofstra, yonik)
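
A likely motivation for SOLR-1936 is that U+2028 may appear unescaped inside a JSON string, yet JavaScript engines have traditionally treated it as a line terminator, so a response evaluated as JavaScript could fail to parse. The sketch below is a hypothetical escaper written for illustration; it is not Solr's JSON writer and handles only the characters needed to show the idea.

```java
public class JsonLineSeparatorDemo {

    private static final char LINE_SEPARATOR = (char) 0x2028;

    // Escapes a string for embedding in a JSON string literal. Only the
    // characters needed for this illustration are handled.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                // Quote and backslash get a backslash prefix.
                sb.append('\\').append(c);
            } else if (c == LINE_SEPARATOR || c < 0x20) {
                // LINE SEPARATOR and control characters become hex escapes.
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String value = "line one" + LINE_SEPARATOR + "line two";
        System.out.println("\"" + escape(value) + "\"");
    }
}
```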
|
||
|
||
* SOLR-1914: Change the JSON response format to output float/double
|
||
values of NaN,Infinity,-Infinity as strings. (yonik)
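
The behavior described in SOLR-1914 amounts to quoting the three values that have no JSON number representation. A minimal sketch, using a hypothetical helper rather than Solr's actual writer:

```java
public class JsonSpecialFloatDemo {

    // Renders a double as a JSON value: finite numbers stay numeric,
    // NaN and the infinities become quoted strings.
    static String toJson(double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            return "\"" + Double.toString(d) + "\"";
        }
        return Double.toString(d);
    }

    public static void main(String[] args) {
        System.out.println(toJson(1.5));
        System.out.println(toJson(Double.NaN));
        System.out.println(toJson(Double.NEGATIVE_INFINITY));
    }
}
```

Clients that read these fields therefore need to accept either a number or one of the three strings.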
|
||
|
||
* SOLR-1948: PatternTokenizerFactory should use parent's args (koji)
|
||
|
||
* SOLR-1870: Indexing documents using the 'javabin' format no longer
|
||
fails with a ClassCastException when SolrInputDocuments contain field
|
||
values which are Collections or other classes that implement
|
||
Iterable. (noble, hossman)
|
||
|
||
* SOLR-1981: Solr will now fail correctly if solr.xml attempts to
|
||
specify multiple cores that have the same name (hossman)
|
||
|
||
* SOLR-1791: Fix messed up core names on admin gui (yonik via koji)
|
||
|
||
* SOLR-1995: Change date format from "hour in am/pm" to "hour in day"
|
||
in CoreContainer and SnapShooter. (Hayato Ito, koji)
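
"Hour in am/pm" and "hour in day" are the SimpleDateFormat pattern letters "h" and "H". The sketch below shows why the distinction matters for timestamps used in names. The class is hypothetical and the pattern strings are assumed for illustration; they are not necessarily the exact patterns used in CoreContainer or SnapShooter.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class HourPatternDemo {

    // Formats a date in UTC with the given pattern.
    static String format(String pattern, Date date) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date threeAm = new Date(3L * 60 * 60 * 1000);  // 1970-01-01T03:00:00Z
        Date threePm = new Date(15L * 60 * 60 * 1000); // 1970-01-01T15:00:00Z

        // "hh" is hour in am/pm (1-12): both times format identically.
        System.out.println(format("yyyyMMddhhmmss", threeAm)); // 19700101030000
        System.out.println(format("yyyyMMddhhmmss", threePm)); // 19700101030000

        // "HH" is hour in day (0-23): the two times stay distinct.
        System.out.println(format("yyyyMMddHHmmss", threeAm)); // 19700101030000
        System.out.println(format("yyyyMMddHHmmss", threePm)); // 19700101150000
    }
}
```

With the 12-hour pattern and no am/pm marker, two timestamps twelve hours apart collide and do not sort chronologically.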
|
||
|
||
* SOLR-2008: avoid possible RejectedExecutionException w/autoCommit
|
||
by making SolrCore close the UpdateHandler before closing the
|
||
SearchExecutor. (NarasimhaRaju, hossman)
|
||
|
||
* SOLR-2036: Avoid expensive fieldCache ram estimation for the
|
||
admin stats page. (yonik)
|
||
|
||
* SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji)
|
||
|
||
* SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers)
|
||
|
||
* SOLR-2100: The replication handler backup command didn't save the commit
|
||
point and hence could fail when a newer commit caused the older commit point
|
||
to be removed before it was finished being copied. This did not affect
|
||
normal master/slave replication. (Peter Sturge via yonik)
|
||
|
||
* SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers)
|
||
|
||
* SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers)
|
||
|
||
* SOLR-2111: Change exception handling in distributed faceting to work more
|
||
like non-distributed faceting, change facet_counts/exception from a String
|
||
to a List<String> to enable listing all exceptions that happened, and
|
||
prevent an exception in one facet command from affecting another
|
||
facet command. (yonik)
|
||
|
||
* SOLR-2110: Remove the restriction on names for local params
|
||
substitution/dereferencing. Properly encode local params in
|
||
distributed faceting. (yonik)
|
||
|
||
* SOLR-2135: Fix behavior of ConcurrentLRUCache when asking for
|
||
getLatestAccessedItems(0) or getOldestAccessedItems(0).
|
||
(David Smiley via hossman)
|
||
|
||
* SOLR-2148: Highlighter doesn't support q.alt. (koji)
|
||
|
||
* SOLR-2180: It was possible for EmbeddedSolrServer to leave searchers
|
||
open if a request threw an exception. (yonik)
|
||
|
||
* SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab)
|
||
|
||
* SOLR-2081: BaseResponseWriter.isStreamingDocs causes
|
||
SingleResponseWriter.end to be called 2x
|
||
(Chris A. Mattmann via hossman)
|
||
|
||
* SOLR-2219: The init() method of every SolrRequestHandler was being
|
||
called twice. (ambikeshwar singh and hossman)
|
||
|
||
* SOLR-2285: duplicate SolrEventListeners no longer created (hossman)
|
||
|
||
* SOLR-1993: fix String cast assumption in JavaBinCodec - specifically
|
||
addresses "commitWithin" option on Update requests.
|
||
(noble, hossman, and Maxim Valyanskiy)
|
||
|
||
* SOLR-2261: fix velocity template layout.vm that referred to an older
|
||
version of jquery. (Eric Pugh via rmuir)
|
||
|
||
* SOLR-2307: fix bug in PHPSerializedResponseWriter (wt=phps) when
|
||
dealing with SolrDocumentList objects -- ie: sharded queries.
|
||
(Antonio Verni via hossman)
|
||
|
||
* SOLR-2127: Fixed serialization of default core and indentation of solr.xml when serializing.
|
||
(Ephraim Ofir, Mark Miller)
|
||
|
||
* SOLR-2320: Fixed ReplicationHandler detail reporting for masters
|
||
(hossman)
|
||
|
||
* SOLR-482: Provide more exception handling in CSVLoader (gsingers)
|
||
|
||
* SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception.
|
||
(Julien Coloos, hossman, yonik)
|
||
|
||
* SOLR-2085: Improve SolrJ behavior when FacetComponent comes before
|
||
QueryComponent (Tomas Salfischberger via hossman)
|
||
|
||
* SOLR-1940: Fix SolrDispatchFilter behavior when Content-Type is
|
||
unknown (Lance Norskog and hossman)
|
||
|
||
* SOLR-1983: snappuller fails when modifiedConfFiles is not empty and
|
||
full copy of index is needed. (Alexander Kanarsky via yonik)
|
||
|
||
* SOLR-2156: SnapPuller fails to clean Old Index Directories on Full Copy
|
||
(Jayendra Patil via yonik)
|
||
|
||
* SOLR-96: Fix XML parsing in XMLUpdateRequestHandler and
|
||
DocumentAnalysisRequestHandler to respect charset from XML file and only
|
||
use HTTP header's "Content-Type" as a "hint". (uschindler)
|
||
|
||
* SOLR-2339: Fix sorting to explicitly generate an error if you
|
||
attempt to sort on a multiValued field. (hossman)
|
||
|
||
* SOLR-2348: Fix field types to explicitly generate an error if you
|
||
attempt to get a ValueSource for a multiValued field. (hossman)
|
||
|
||
* SOLR-2380: Distributed faceting could miss values when facet.sort=index
|
||
and when facet.offset was greater than 0. (yonik)
|
||
|
||
* SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
|
||
are fixed to be resolved using the URI standard (RFC 2396). The system
|
||
identifier is no longer a plain filename with path, it gets initialized
|
||
using a custom URI scheme "solrres:". This scheme is resolved using an
|
||
EntityResolver that utilizes ResourceLoader
|
||
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
|
||
paths in Solr's config files behave as expected. This change
|
||
introduces some backwards breaks in the API: Some config classes
|
||
(Config, SolrConfig, IndexSchema) were changed to take
|
||
org.xml.sax.InputSource instead of InputStream. There may also be some
|
||
backwards breaks in existing config files; it is recommended to check
|
||
your config files / XSLTs and replace all XIncludes/HREFs that were
|
||
hacked to use absolute paths to use relative ones. (uschindler)
|
||
|
||
* SOLR-309: Fix FieldType so setting an analyzer on a FieldType that
|
||
doesn't expect it will generate an error. Practically speaking this
|
||
means that Solr will now correctly generate an error on
|
||
initialization if the schema.xml contains an analyzer configuration
|
||
for a fieldType that does not use TextField. (hossman)
|
||
|
||
* SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
|
||
thread safe and could throw an exception. (yonik)
|
||
|
||
* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary
|
||
option (gsingers)
|
||
|
||
* SOLR-1756: The date.format setting for extraction request handler causes
|
||
ClassCastException when enabled and the config code that parses this setting
|
||
does not properly use the same iterator instance.
|
||
(Christoph Brill, Mark Miller)
|
||
|
||
* SOLR-1638: Fixed NullPointerException during DIH import if uniqueKey is not
|
||
specified in schema (Akshay Ukey via shalin)
|
||
|
||
* SOLR-1639: Fixed misleading error message when dataimport.properties is not
|
||
writable (shalin)
|
||
|
||
* SOLR-1598: DIH: Reader used in PlainTextEntityProcessor is not explicitly
|
||
closed (Sascha Szott via noble)
|
||
|
||
* SOLR-1759: DIH: $skipDoc was not working correctly
|
||
(Gian Marco Tagliani via noble)
|
||
|
||
* SOLR-1762: DIH: DateFormatTransformer does not work correctly with
|
||
non-default locale dates (tommy chheng via noble)
|
||
|
||
* SOLR-1757: DIH multithreading sometimes throws NPE (noble)
|
||
|
||
* SOLR-1766: DIH with threads enabled doesn't respond to the abort command
|
||
(Michael Henson via noble)
|
||
|
||
* SOLR-1767: dataimporter.functions.escapeSql() does not escape backslash
|
||
character (Sean Timm via noble)
|
||
|
||
* SOLR-1811: formatDate should use the current NOW value always
|
||
(Sean Timm via noble)
|
||
|
||
* SOLR-1794: Dataimport of CLOB fields fails when getCharacterStream() is
|
||
defined in a superclass. (Gunnar Gauslaa Bergem via rmuir)
|
||
|
||
* SOLR-2057: DataImportHandler never calls UpdateRequestProcessor.finish()
|
||
(Drew Farris via koji)
|
||
|
||
* SOLR-1973: Empty fields in XML update messages confuse DataImportHandler.
|
||
(koji)
|
||
|
||
* SOLR-2221: Use StrUtils.parseBool() to get values of boolean options in DIH.
|
||
true/on/yes (for TRUE) and false/off/no (for FALSE) can be used for
|
||
sub-options (debug, verbose, synchronous, commit, clean, optimize) for
|
||
full/delta-import commands. (koji)
|
||
|
||
* SOLR-2310: DIH: getTimeElapsedSince() returns incorrect hour value when
|
||
the elapse is over 60 hours (tom liu via koji)
|
||
|
||
* SOLR-2252: DIH: When a child entity in nested entities is rootEntity="true",
|
||
delta-import doesn't work. (koji)
|
||
|
||
* SOLR-2330: solrconfig.xml files in example-DIH are broken. (Matt Parker, koji)
|
||
|
||
* SOLR-1191: resolve DataImportHandler deltaQuery column against pk when pk
|
||
has a prefix (e.g. pk="book.id" deltaQuery="select id from ..."). More
|
||
useful error reporting when no match found (previously failed with a
|
||
NullPointerException in log and no clear user feedback). (gthb via yonik)
|
||
|
||
* SOLR-2116: Fix TikaConfig classloader bug in TikaEntityProcessor
|
||
(Martijn van Groningen via hossman)
|
||
|
||
Other Changes
|
||
----------------------
|
||
|
||
* SOLR-1602: Refactor SOLR package structure to include o.a.solr.response
|
||
and move QueryResponseWriters in there
|
||
(Chris A. Mattmann, ryan, hoss)
|
||
|
||
* SOLR-1516: Addition of an abstract BaseResponseWriter class to simplify the
|
||
development of QueryResponseWriter implementations.
|
||
(Chris A. Mattmann via noble)
|
||
|
||
* SOLR-1592: Refactor XMLWriter startTag to allow arbitrary attributes to be written
|
||
(Chris A. Mattmann via noble)
|
||
|
||
* SOLR-1561: Added Lucene 2.9.1 spatial contrib jar to lib. (gsingers)
|
||
|
||
* SOLR-1570: Log warnings if uniqueKey is multi-valued or not stored (hossman, shalin)
|
||
|
||
* SOLR-1558: QueryElevationComponent only works if the uniqueKey field is
|
||
implemented using StrField. In previous versions of Solr no warning or
|
||
error would be generated if you attempted to use QueryElevationComponent,
|
||
it would just fail in unexpected ways. This has been changed so that it
|
||
will fail with a clear error message on initialization. (hossman)
|
||
|
||
* SOLR-1611: Added Lucene 2.9.1 collation contrib jar to lib (shalin)
|
||
|
||
* SOLR-1608: Extract base class from TestDistributedSearch to make
|
||
it easy to write test cases for other distributed components. (shalin)
|
||
|
||
* Upgraded to Lucene 2.9-dev r888785 (shalin)
|
||
|
||
* SOLR-1610: Generify SolrCache (Jason Rutherglen via shalin)
|
||
|
||
* SOLR-1637: Remove ALIAS command
|
||
|
||
* SOLR-1662: Added Javadocs in BufferedTokenStream and fixed incorrect cloning
|
||
in TestBufferedTokenStream (Robert Muir, Uwe Schindler via shalin)
|
||
|
||
* SOLR-1674: Improve analysis tests and cut over to new TokenStream API.
|
||
(Robert Muir via Mark Miller)
|
||
|
||
* SOLR-1661: Remove adminCore from CoreContainer. Removed deprecated methods setAdminCore(), getAdminCore() (noble)
|
||
|
||
* SOLR-1704: Google collections moved from clustering to core (noble)
|
||
|
||
* SOLR-1268: Add Lucene 2.9-dev r888785 FastVectorHighlighter contrib jar to lib. (koji)
|
||
|
||
* SOLR-1538: Reordering of object allocations in ConcurrentLRUCache to eliminate
|
||
(an extremely small) potential for deadlock.
|
||
(gabriele renzi via hossman)
|
||
|
||
* SOLR-1588: Removed some very old dead code.
|
||
(Chris A. Mattmann via hossman)
|
||
|
||
* SOLR-1696: Deprecate old <highlighting> syntax and move configuration to HighlightComponent (noble)
|
||
|
||
* SOLR-1727: SolrEventListener should extend NamedListInitializedPlugin (noble)
|
||
|
||
* SOLR-1771: Improved error message when StringIndex cannot be initialized
|
||
for a function query (hossman)
|
||
|
||
* SOLR-1695: Improved error messages when adding a document that does not
|
||
contain exactly one value for the uniqueKey field (hossman)
|
||
|
||
* SOLR-1776: DismaxQParser and ExtendedDismaxQParser now use the schema.xml
|
||
"defaultSearchField" as the default value for the "qf" param instead of failing
|
||
with an error when "qf" is not specified. (hossman)
|
||
|
||
* SOLR-1851: luceneAutoCommit no longer has any effect - it has been removed (Mark Miller)
|
||
|
||
* SOLR-1865: SolrResourceLoader.getLines ignores Byte Order Markers (BOMs) at the
|
||
beginning of input files, these are often created by editors such as Windows
|
||
Notepad. (rmuir, hossman)
|
||
|
||
* SOLR-1938: ElisionFilterFactory will use a default set of French contractions
|
||
if you do not supply a custom articles file. (rmuir)
|
||
|
||
* SOLR-2003: SolrResourceLoader will report any encoding errors, rather than
|
||
silently using replacement characters for invalid inputs (blargy via rmuir)
|
||
|
||
* SOLR-1804: Google collections updated to Google Guava (which is a superset of collections and contains bug fixes) (gsingers)
|
||
|
||
* SOLR-2034: Switch to JavaBin codec version 2. Strings are now serialized
|
||
as the number of UTF-8 bytes, followed by the bytes in UTF-8. Previously
|
||
Strings were serialized as the number of UTF-16 chars, followed by the
|
||
bytes in Modified UTF-8. (hossman, yonik, rmuir)
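
The three quantities named in SOLR-2034 differ as soon as a string contains non-ASCII text. The sketch below does not use JavaBinCodec; the class is hypothetical. DataOutputStream.writeUTF is used only as a convenient way to obtain Modified UTF-8 bytes. Its own 2-byte length prefix is a byte count, which is a different framing from the character-count prefix described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StringLengthPrefixDemo {

    // Length prefix in the old format: number of UTF-16 chars.
    static int utf16CodeUnits(String s) {
        return s.length();
    }

    // Length prefix in codec version 2: number of UTF-8 bytes.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the Modified UTF-8 payload, measured via writeUTF
    // (minus the 2-byte length that writeUTF itself prepends).
    static int modifiedUtf8Bytes(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            new DataOutputStream(buf).writeUTF(s);
            return buf.size() - 2;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "h\u00e9\uD83D\uDE00"; // 'h', U+00E9, U+1F600
        System.out.println(utf16CodeUnits(s));    // 4
        System.out.println(utf8Bytes(s));         // 7
        System.out.println(modifiedUtf8Bytes(s)); // 9
    }
}
```

In version 2 the prefix equals the payload size, so a reader knows how many bytes to consume before decoding them.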
|
||
|
||
* SOLR-2013: Add mapping-FoldToASCII.txt to example conf directory.
|
||
(Steven Rowe via koji)
|
||
|
||
* SOLR-2213: Upgrade to jQuery 1.4.3 (Erick Erickson via ryan)
|
||
|
||
* SOLR-1826: Add unit tests for highlighting with termOffsets=true
|
||
and overlapping tokens. (Stefan Oestreicher via rmuir)
|
||
|
||
* SOLR-2340: Add version infos to message in JavaBinCodec when throwing
|
||
exception. (koji)
|
||
|
||
* SOLR-2350: Since Solr no longer requires XML files to be in UTF-8
|
||
(see SOLR-96) SimplePostTool (aka: post.jar) has been improved to
|
||
work with files of any mime-type or charset. (hossman)
|
||
|
||
* SOLR-2365: Move DIH jars out of solr.war (David Smiley via yonik)
|
||
|
||
* SOLR-2381: Include a patched version of Jetty (6.1.26 + JETTY-1340)
|
||
to fix problematic UTF-8 handling for supplementary characters.
|
||
(Bernd Fehling, uschindler, yonik, rmuir)
|
||
|
||
* SOLR-2391: The preferred Content-Type for XML was changed to
|
||
application/xml. XMLResponseWriter now only delivers using this
|
||
type; updating documents and analyzing documents is still supported
|
||
using text/xml as Content-Type, too. If you have clients that are
|
||
hardcoded on text/xml as Content-Type, you have to change them.
|
||
(uschindler, rmuir)
|
||
|
||
* SOLR-2414: All ResponseWriters now use only ServletOutputStreams
|
||
and wrap their own Writer around it when serializing. This fixes
|
||
the bug in PHPSerializedResponseWriter that produced wrong string
|
||
length if the servlet container had a broken UTF-8 encoding that was
|
||
in fact CESU-8 (see SOLR-1091). The system property to enable the
|
||
CESU-8 byte counting in PHPSerializedResponseWriter for broken
|
||
servlet containers was therefore removed and is now ignored if set.
|
||
Output is always UTF-8. (uschindler, yonik, rmuir)
|
||
|
||
* SOLR-141: Errors and Exceptions are formatted by ResponseWriter.
|
||
(Mike Sokolov, Rich Cariens, Daniel Naber, ryan)
|
||
|
||
* SOLR-1902: Upgraded to Tika 0.8 and changed deprecated parse call
|
||
|
||
* SOLR-1813: Add ICU4j to contrib/extraction libs and add tests for Arabic
|
||
extraction (Robert Muir via gsingers)
|
||
|
||
* SOLR-1821: Fix TimeZone-dependent test failure in TestEvaluatorBag.
|
||
(Chris Male via rmuir)
|
||
|
||
* SOLR-2367: Reduced noise in test output by ensuring the properties file
|
||
can be written. (Gunnlaugur Thor Briem via rmuir)
|
||
|
||
Build
----------------------

* SOLR-1522: Automated release signing process. (gsingers)

* SOLR-1891: Make lucene-jars-to-solr fail if copying any of the jars fails, and
update clean to remove the jars in that directory (Mark Miller)

* LUCENE-2466: Commons-Codec was upgraded from 1.3 to 1.4. (rmuir)

* SOLR-2042: Fixed some Maven deps (Drew Farris via gsingers)

* LUCENE-2657: Switch from using Maven POM templates to full POMs when
generating Maven artifacts (Steven Rowe)
|
||
|
||
Documentation
----------------------

* SOLR-1590: Javadoc for XMLWriter#startTag
(Chris A. Mattmann via hossman)

* SOLR-1792: Documented peculiar behavior of TestHarness.LocalRequestFactory
(hossman)
|
||
|
||
================== Release 1.4.0 ==================
|
||
Release Date: See http://lucene.apache.org/solr for the official release date.
|
||
|
||
Upgrading from Solr 1.3
|
||
-----------------------
|
||
|
||
There is a new default faceting algorithm for multiValued fields that should be
faster for most cases. One can revert to the previous algorithm (which has
also been improved somewhat) by adding facet.method=enum to the request.

Searching and sorting is now done on a per-segment basis, meaning that
the FieldCache entries used for sorting and for function queries are
created and used per-segment and can be reused for segments that don't
change between index updates. While generally beneficial, this can lead
to increased memory usage over 1.3 in certain scenarios:
  1) A single valued field that was used for both sorting and faceting
     in 1.3 would have used the same top level FieldCache entry. In 1.4,
     sorting will use entries at the segment level while faceting will still
     use entries at the top reader level, leading to increased memory usage.
  2) Certain function queries such as ord() and rord() require a top level
     FieldCache instance and can thus lead to increased memory usage. Consider
     replacing ord() and rord() with alternatives, such as function queries
     based on ms() for date boosting.
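
One commonly used ms()-based replacement for rord() date boosting is a
reciprocal of document age, along the lines of recip(ms(NOW,datefield),m,a,b).
The sketch below only reproduces the arithmetic a/(m*x+b) in plain Java so the
shape of the boost is easy to see. The constant 3.16e-11 (roughly one over the
number of milliseconds in a year) and the method names are illustrative
assumptions, not values prescribed by these notes.

```java
class DateBoostSketch {
    /** Reciprocal function: recip(x, m, a, b) = a / (m * x + b). */
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    /**
     * Boost for a document that is ageMillis old.
     * With m = 3.16e-11, a = 1 and b = 1 the boost is 1.0 for a brand new
     * document and falls to roughly one half after about a year.
     */
    static double ageBoost(long ageMillis) {
        return recip((double) ageMillis, 3.16e-11, 1.0, 1.0);
    }

    public static void main(String[] args) {
        long oneYearMs = 365L * 24 * 60 * 60 * 1000;
        System.out.println("new document: " + ageBoost(0L));
        System.out.println("one year old: " + ageBoost(oneYearMs));
        System.out.println("ten years old: " + ageBoost(10 * oneYearMs));
    }
}
```

Because ms() works on per-document values rather than on ordinals, a boost of
this form does not need the top level FieldCache entry that ord() and rord()
require.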

If you use custom Tokenizer or TokenFilter components in a chain specified in
schema.xml, they must support reusability. If your Tokenizer or TokenFilter
maintains state, it should implement reset(). If your TokenFilterFactory does
not return a subclass of TokenFilter, then it should implement reset() and call
reset() on its input TokenStream. TokenizerFactory implementations must
now return a Tokenizer rather than a TokenStream.

New users of Solr 1.4 will have omitTermFreqAndPositions enabled for non-text
indexed fields by default, which avoids indexing term frequency, positions, and
payloads, making the index smaller and faster. If you are upgrading from an
earlier Solr release and want to enable omitTermFreqAndPositions by default,
change the schema version from 1.1 to 1.2 in schema.xml. Remove any existing
index and restart Solr to ensure that omitTermFreqAndPositions completely takes
effect.

The default QParserPlugin used by the QueryComponent for parsing the "q" param
has been changed, to remove support for the deprecated use of ";" as a separator
between the query string and the sort options when no "sort" param was used.
Users who wish to continue using the semi-colon based method of specifying the
sort options should explicitly set the defType param to "lucenePlusSort" on all
requests. (The simplest way to do this is by specifying it as a default param
for your request handlers in solrconfig.xml, see the example solrconfig.xml for
sample syntax.)
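
For clients that still build the legacy "query;sort" string and would rather
migrate than set defType=lucenePlusSort, one option is to split the string on
the client side and send separate "q" and "sort" params. The sketch below is a
minimal, hypothetical helper (the class and method names are made up for this
example). It assumes the last semicolon is the separator and does not handle
semicolons inside quoted phrases or escaped semicolons.

```java
class LegacySortSplitter {
    /**
     * Splits a legacy "query;sort" value.
     * Returns a two-element array {query, sort}; sort is null when there is
     * no ';' separator or nothing follows it.
     */
    static String[] split(String legacyQ) {
        int idx = legacyQ.lastIndexOf(';');
        if (idx < 0) {
            return new String[] { legacyQ.trim(), null };
        }
        String query = legacyQ.substring(0, idx).trim();
        String sort = legacyQ.substring(idx + 1).trim();
        return new String[] { query, sort.isEmpty() ? null : sort };
    }

    public static void main(String[] args) {
        String[] parts = split("ipod;price desc");
        System.out.println("q=" + parts[0]);
        System.out.println("sort=" + parts[1]);
    }
}
```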

If spellcheck.extendedResults=true, the response format for suggestions
has changed, see SOLR-1071.

Use of the "charset" option when configuring the following Analysis
Factories has been deprecated and will cause a warning to be logged.
In future versions of Solr attempting to use this option will cause an
error. See SOLR-1410 for more information.
  - GreekLowerCaseFilterFactory
  - RussianStemFilterFactory
  - RussianLowerCaseFilterFactory
  - RussianLetterTokenizerFactory

DIH: Evaluator API has been changed in a non back-compatible way. Users who
have developed custom Evaluators will need to change their code according to
the new API for it to work. See SOLR-996 for details.

DIH: The formatDate evaluator's syntax has been changed. The new syntax is
formatDate(<variable>, '<format_string>'). For example,
formatDate(x.date, 'yyyy-MM-dd'). In the old syntax, the format string was
written without single quotes. The old syntax has been deprecated and will
be removed in 1.5; until then, using the old syntax will log a warning.
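
To show what a format string such as 'yyyy-MM-dd' produces, the sketch below
applies it with java.text.SimpleDateFormat. This assumes the formatDate
evaluator follows SimpleDateFormat pattern syntax. The helper class is
hypothetical and is not part of DIH, and the UTC time zone is fixed here only
to make the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class FormatDateSketch {
    /** Formats a date with a SimpleDateFormat pattern, always in UTC. */
    static String format(Date date, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Same pattern as in formatDate(x.date, 'yyyy-MM-dd')
        System.out.println(format(now, "yyyy-MM-dd"));
        // A finer-grained pattern, for comparison
        System.out.println(format(now, "yyyy-MM-dd HH:mm:ss"));
    }
}
```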

DIH: The Context API has been changed in a non back-compatible way. In
particular, the Context.currentProcess() method now returns a String
describing the type of the current import process instead of an int.
Similarly, the public constants in Context viz. FULL_DUMP, DELTA_DUMP and
FIND_DELTA are changed to a String type. See SOLR-969 for details.

DIH: The EntityProcessor API has been simplified by moving logic for applying
transformers and handling multi-row outputs from Transformers into an
EntityProcessorWrapper class. The EntityProcessor#destroy is now called once
per parent-row at the end of row (end of data). A new method
EntityProcessor#close is added which is called at the end of import.

DIH: In Solr 1.3, if the last_index_time was not available (first import) and
a delta-import was requested, a full-import was run instead. This is no longer
the case. In Solr 1.4 delta import is run with last_index_time as the epoch
date (January 1, 1970, 00:00:00 GMT) if last_index_time is not available.
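
The fallback described above can be pictured with a few lines of plain Java.
This is only a sketch of the behavior, not the DataImportHandler code, and the
class and method names are hypothetical.

```java
import java.util.Date;

class LastIndexTimeSketch {
    /**
     * Returns the timestamp a delta import would compare against:
     * the stored last_index_time when one exists, otherwise the epoch
     * (January 1, 1970, 00:00:00 GMT).
     */
    static Date effectiveLastIndexTime(Date stored) {
        return stored != null ? stored : new Date(0L);
    }

    public static void main(String[] args) {
        // First import: nothing stored yet, so the epoch is used.
        System.out.println(effectiveLastIndexTime(null).getTime());
        // Later imports: the stored value is used unchanged.
        System.out.println(effectiveLastIndexTime(new Date(1257811200000L)).getTime());
    }
}
```

A delta query that selects rows modified after last_index_time would, under
this fallback, presumably match every row that carries a modification time
later than 1970 on the first run.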

Versions of Major Components
----------------------------
Apache Lucene 2.9.1 (r832363 on 2.9 branch)
Apache Tika 0.4
Carrot2 3.1.0

Lucene Information
----------------

Since Solr is built on top of Lucene, many people add customizations to Solr
that are dependent on Lucene. Please see http://lucene.apache.org/java/2_9_0/,
especially http://lucene.apache.org/java/2_9_0/changes/Changes.html for more
information on the version of Lucene used in Solr.

Detailed Change List
----------------------

New Features
----------------------
1. SOLR-560: Use SLF4J logging API rather than JDK logging. The packaged .war file is
    shipped with a JDK logging implementation, so logging configuration for the .war should
    be identical to Solr 1.3. However, if you are using the .jar file, you can select
    which logging implementation to use by dropping a different binding.
    See: http://www.slf4j.org/ (ryan)

2. SOLR-617: Allow configurable index deletion policy and provide a default implementation which
    allows deletion of commit points on various criteria such as number of commits, age of commit
    point and optimized status.
    See http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/index/IndexDeletionPolicy.html
    (yonik, Noble Paul, Akshay Ukey via shalin)

3. SOLR-658: Allow Solr to load index from arbitrary directory in dataDir
    (Noble Paul, Akshay Ukey via shalin)

4. SOLR-793: Add 'commitWithin' argument to the update add command. This behaves
    similarly to the global autoCommit maxTime argument except that it is set for
    each request. (ryan)

5. SOLR-670: Add support for rollbacks in UpdateHandler. This allows users to roll back all changes
    since the last commit. (Noble Paul, koji via shalin)

6. SOLR-813: Adding DoubleMetaphone Filter and Factory. Similar to the PhoneticFilter,
    but this uses DoubleMetaphone specific calls (including alternate encoding)
    (Todd Feak via ryan)

7. SOLR-680: Add StatsComponent. This gets simple statistics on matched numeric fields,
    including: min, max, mean, median, stddev. (koji, ryan)

    - SOLR-1380: Added support for multi-valued fields (Harish Agarwal via gsingers)

8. SOLR-561: Added Replication implemented in Java as a request handler. Supports index replication
    as well as configuration replication and exposes detailed statistics and progress information
    on the Admin page. Works on all platforms. (Noble Paul, yonik, Akshay Ukey, shalin)

9. SOLR-746: Added "omitHeader" request parameter to omit the header from the response.
    (Noble Paul via shalin)

10. SOLR-651: Added TermVectorComponent for serving up term vector information, plus IDF.
    See http://wiki.apache.org/solr/TermVectorComponent (gsingers, Vaijanath N. Rao, Noble Paul)

12. SOLR-795: SpellCheckComponent supports building indices on optimize if configured in solrconfig.xml
    (Jason Rennie, shalin)

13. SOLR-667: A LRU cache implementation based upon ConcurrentHashMap and other techniques to reduce
    contention and synchronization overhead, to utilize multiple CPU cores more effectively.
    (Fuad Efendi, Noble Paul, yonik via shalin)

14. SOLR-465: Add configurable DirectoryProvider so that alternate Directory
    implementations can be specified via solrconfig.xml. The default
    DirectoryProvider will use NIOFSDirectory for better concurrency
    on non-Windows platforms. (Mark Miller, TJ Laurenzo via yonik)

15. SOLR-822: Add CharFilter so that characters can be filtered (e.g. character normalization)
    before Tokenizer/TokenFilters. (koji)

16. SOLR-829: Allow slaves to request compressed files from master during replication
    (Simon Collins, Noble Paul, Akshay Ukey via shalin)

17. SOLR-877: Added TermsComponent for accessing Lucene's TermEnum capabilities.
    Useful for auto suggest and possibly distributed search. Not distributed search compliant. (gsingers)
    - Added mincount and maxcount options (Khee Chin via gsingers)

18. SOLR-538: Add maxChars attribute for copyField function so that the length limit for destination
    can be specified.
    (Georgios Stamatis, Lars Kotthoff, Chris Harris via koji)

19. SOLR-284: Added support for extracting content from binary documents like MS Word and PDF using Apache Tika. See also contrib/extraction/CHANGES.txt (Eric Pugh, Chris Harris, yonik, gsingers)

20. SOLR-819: Added factories for Arabic support (gsingers)

21. SOLR-781: Distributed search ability to sort field.facet values
    lexicographically. facet.sort values "true" and "false" are
    also deprecated and replaced with "count" and "lex".
    (Lars Kotthoff via yonik)

22. SOLR-821: Add support for replication to copy conf file to slave with a different name. This allows replication
    of solrconfig.xml
    (Noble Paul, Akshay Ukey via shalin)

23. SOLR-911: Add support for multi-select faceting by allowing filters to be
    tagged and facet commands to exclude certain filters. This patch also
    added the ability to change the output key for facets in the response, and
    optimized distributed faceting refinement by lowering parsing overhead and
    by making requests and responses smaller.

24. SOLR-876: WordDelimiterFilter now supports a splitOnNumerics
    option, as well as a list of protected terms.
    (Dan Rosher via hossman)

25. SOLR-928: SolrDocument and SolrInputDocument now implement the Map<String,?>
    interface. This should make plugging into other standard tools easier. (ryan)

26. SOLR-847: Enhance the snappull command in ReplicationHandler to accept masterUrl.
    (Noble Paul, Preetam Rao via shalin)

27. SOLR-540: Add support for globbing in field names to highlight.
    For example, hl.fl=*_text will highlight all fieldnames ending with
    _text. (Lars Kotthoff via yonik)

28. SOLR-906: Adding a StreamingUpdateSolrServer that writes update commands to
    an open HTTP connection. If you are using solrj for bulk update requests
    you should consider switching to this implementation. However, note that
    the error handling is not immediate as it is with the standard SolrServer.
    (ryan)

29. SOLR-865: Adding support for document updates in binary format and corresponding support in Solrj client.
    (Noble Paul via shalin)

30. SOLR-763: Add support for Lucene's PositionFilter (Mck SembWever via shalin)

31. SOLR-966: Enhance the map() function query to take in an optional default value (Noble Paul, shalin)

32. SOLR-820: Support replication on startup of master with new index. (Noble Paul, Akshay Ukey via shalin)

33. SOLR-943: Make it possible to specify dataDir in solr.xml and accept the dataDir as a request parameter for
    the CoreAdmin create command. (Noble Paul via shalin)

34. SOLR-850: Addition of timeouts for distributed searching. Configurable through 'shard-socket-timeout' and
    'shard-connection-timeout' parameters in SearchHandler. (Patrick O'Leary via shalin)

35. SOLR-799: Add support for hash based exact/near duplicate document
    handling. (Mark Miller, yonik)

36. SOLR-1026: Add protected words support to SnowballPorterFilterFactory (ehatcher)

37. SOLR-739: Add support for OmitTf (Mark Miller via yonik)

38. SOLR-1046: Nested query support for the function query parser
    and lucene query parser (the latter existed as an undocumented
    feature in 1.3) (yonik)

39. SOLR-940: Add support for Lucene's Trie Range Queries by providing new FieldTypes in
    schema for int, float, long, double and date. Single-valued Trie based
    fields with a precisionStep will index multiple precisions and enable
    faster range queries. (Uwe Schindler, yonik, shalin)

40. SOLR-1038: Enhance CommonsHttpSolrServer to add docs in batch using an iterator API (Noble Paul via shalin)

41. SOLR-844: A SolrServer implementation that front-ends multiple Solr servers and provides load balancing and failover
    support (Noble Paul, Mark Miller, hossman via shalin)

42. SOLR-939: ValueSourceRangeFilter/Query - filter based on values in a FieldCache entry or on any arbitrary function of field values. (yonik)

43. SOLR-1095: Fixed performance problem in the StopFilterFactory and simplified code. Added tests as well. (gsingers)

44. SOLR-1096: Introduced httpConnTimeout and httpReadTimeout in replication slave configuration to avoid stalled
    replication. (Jeff Newburn, Noble Paul, shalin)

45. SOLR-1115: <bool>on</bool> and <bool>yes</bool> work as expected in solrconfig.xml. (koji)

46. SOLR-1099: A FieldAnalysisRequestHandler which provides the analysis functionality of the web admin page as
    a service. The AnalysisRequestHandler is renamed to DocumentAnalysisRequestHandler which is enhanced with
    query analysis and showMatch support. AnalysisRequestHandler is now deprecated. Support for both
    FieldAnalysisRequestHandler and DocumentAnalysisRequestHandler is also provided in the Solrj client.
    (Uri Boness, shalin)

47. SOLR-1106: Made CoreAdminHandler Actions pluggable so that additional actions may be plugged in or the existing
    ones can be overridden if needed. (Kay Kay, Noble Paul, shalin)

48. SOLR-1124: Add a top() function query that causes its argument to
    have its values derived from the top level IndexReader, even when
    invoked from a sub-reader. top() is implicitly used for the
    ord() and rord() functions. (yonik)

49. SOLR-1110: Support sorting on trie fields with Distributed Search. (Mark Miller, Uwe Schindler via shalin)

50. SOLR-1121: CoreAdminHandler should not need a core. This makes it possible to start a Solr server without a core. (noble)

51. SOLR-769: Added support for clustering in contrib/clustering. See http://wiki.apache.org/solr/ClusteringComponent for more info. (gsingers, Stanislaw Osinski)

52. SOLR-1175: Disable/enable replication on the master side. Added two commands, 'enableReplication' and 'disableReplication'. (noble)

53. SOLR-1179: DocSets can now be used as Lucene Filters via
    DocSet.getTopFilter() (yonik)

54. SOLR-1116: Add a Binary FieldType (noble)

55. SOLR-1051: Support the merge of multiple indexes as a CoreAdmin and an update command (Ning Li via shalin)

56. SOLR-1152: Snapshoot on ReplicationHandler should accept location as a request parameter (shalin)

57. SOLR-1204: Enhance SpellingQueryConverter to handle UTF-8 instead of ASCII only.
    Use the NMTOKEN syntax for matching field names.
    (Michael Ludwig, shalin)

58. SOLR-1189: Support providing username and password for basic HTTP authentication in Java replication
    (Matthew Gregg, shalin)

59. SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader implementations
    can be specified via solrconfig.xml. Note that using a custom IndexReader may be incompatible
    with ReplicationHandler (see comments in SOLR-1366). This should be treated as an experimental feature.
    (Andrzej Bialecki, hossman, Mark Miller, John Wang)

60. SOLR-1214: Differentiate between solr home and instanceDir. Deprecates the method SolrResourceLoader#locateInstanceDir(),
    which is renamed to locateSolrHome (noble)

61. SOLR-1216: Disambiguate the replication command names. 'snappull' becomes 'fetchindex' and 'abortsnappull' becomes 'abortfetch'. (noble)

62. SOLR-1145: Add capability to specify an infoStream log file for the underlying Lucene IndexWriter in solrconfig.xml.
    This is an advanced debug log file that can be used to aid developers in fixing IndexWriter bugs. See the commented
    out example in the example solrconfig.xml under the indexDefaults section.
    (Chris Harris, Mark Miller)

63. SOLR-1256: Show the output of CharFilters in analysis.jsp. (koji)

64. SOLR-1266: Added stemEnglishPossessive option (default=true) to WordDelimiterFilter
    that allows disabling of English possessive stemming (removal of trailing 's from tokens)
    (Robert Muir via yonik)

65. SOLR-1237: firstSearcher and newSearcher can now be identified via the CommonParams.EVENT (evt) parameter
    in a request. This allows a RequestHandler or SearchComponent to know when a newSearcher or firstSearcher
    event happened. QuerySenderListener is the only implementation in Solr that implements this, but outside
    implementations may wish to. See the AbstractSolrEventListener for a helper method. (gsingers)

66. SOLR-1343: Added HTMLStripCharFilter and marked HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
    HTMLStripStandardTokenizerFactory deprecated. To strip HTML tags, HTMLStripCharFilter can be used
    with an arbitrary Tokenizer. (koji)

67. SOLR-1275: Add expungeDeletes to DirectUpdateHandler2 (noble)

68. SOLR-1372: Enhance FieldAnalysisRequestHandler to accept field value from content stream (ehatcher)

69. SOLR-1370: Show the output of CharFilters in FieldAnalysisRequestHandler (koji)

70. SOLR-1373: Add Filter query to admin/form.jsp
    (Jason Rutherglen via hossman)

71. SOLR-1368: Add ms() function query for getting milliseconds from dates and for
    high precision date subtraction, add sub() for subtracting other arguments.
    (yonik)

72. SOLR-1156: Sort TermsComponent results by frequency (Matt Weber via yonik)

73. SOLR-1335: Load core properties from a properties file (noble)

74. SOLR-1385: Add an 'enable' attribute to all plugins (noble)

75. SOLR-1414: Implicit core properties are not set for single core (noble)

76. SOLR-659: Adds shards.start and shards.rows to distributed search
    to allow more efficient bulk queries (those that retrieve many or all
    documents). (Brian Whitman via yonik)

77. SOLR-1321: Add better support for efficient wildcard handling (Andrzej Bialecki, Robert Muir, gsingers)

78. SOLR-1326: New interface PluginInfoInitialized for all types of plugin (noble)

79. SOLR-1447: Simple property injection. <mergePolicy> & <mergeScheduler> syntaxes are now deprecated
    (Jason Rutherglen, noble)

80. SOLR-908: CommonGramsFilterFactory/CommonGramsQueryFilterFactory for
    speeding up phrase queries containing common words by indexing
    n-grams and using them at query time.
    (Tom Burton-West, Jason Rutherglen via yonik)

81. SOLR-1292: Add FieldCache introspection to stats.jsp and JMX Monitoring via
    a new SolrFieldCacheMBean. (hossman)

82. SOLR-1167: Solr Config now supports XInclude for XML engines that can support it. (Bryan Talbot via gsingers)

83. SOLR-1478: Enable sort by Lucene docid. (ehatcher)

84. SOLR-1449: Add <lib> elements to solrconfig.xml to specify additional
    classpath directories and regular expressions. (hossman via yonik)

85. SOLR-1128: Added metadata output to extraction request handler "extract
    only" option. (gsingers)

86. SOLR-1274: Added text serialization output for extractOnly
    (Peter Wolanin, gsingers)

87. SOLR-768: DIH: Set last_index_time variable in full-import command.
    (Wojtek Piaseczny, Noble Paul via shalin)

88. SOLR-811: Allow a "deltaImportQuery" attribute in SqlEntityProcessor
    which is used for delta imports instead of DataImportHandler manipulating
    the SQL itself. (Noble Paul via shalin)

89. SOLR-842: Better error handling in DataImportHandler with options to
    abort, skip and continue imports. (Noble Paul, shalin)

90. SOLR-833: DIH: A DataSource to read data from a field as a reader. This
    can be used, for example, to read XMLs residing as CLOBs or BLOBs in
    databases. (Noble Paul via shalin)

91. SOLR-887: A DIH Transformer to strip HTML tags. (Ahmed Hammad via shalin)

92. SOLR-886: DataImportHandler should rollback when an import fails or it is
    aborted (shalin)

93. SOLR-891: A DIH Transformer to read strings from Clob type.
    (Noble Paul via shalin)

94. SOLR-812: Configurable JDBC settings in JdbcDataSource including optimized
    defaults for read only mode. (David Smiley, Glen Newton, shalin)

95. SOLR-910: Add a few utility commands to the DIH admin page such as full
    import, delta import, status, reload config. (Ahmed Hammad via shalin)

96. SOLR-938: Add event listener API for DIH import start and end.
    (Kay Kay, Noble Paul via shalin)

97. SOLR-801: DIH: Add support for configurable pre-import and post-import
    delete query per root-entity. (Noble Paul via shalin)

98. SOLR-988: Add a new scope for session data stored in Context to store
    objects across imports. (Noble Paul via shalin)

99. SOLR-980: A PlainTextEntityProcessor which can read from any
    DataSource<Reader> and output a String.
    (Nathan Adams, Noble Paul via shalin)

100.SOLR-1003: XPathEntityprocessor must allow slurping all text from a given
    xml node and its children. (Noble Paul via shalin)

101.SOLR-1001: Allow variables in various attributes of RegexTransformer,
    HTMLStripTransformer and NumberFormatTransformer.
    (Fergus McMenemie, Noble Paul, shalin)

102.SOLR-989: DIH: Expose running statistics from the Context API.
    (Noble Paul, shalin)

103.SOLR-996: DIH: Expose Context to Evaluators. (Noble Paul, shalin)

104.SOLR-783: DIH: Enhance delta-imports by maintaining separate
    last_index_time for each entity. (Jon Baer, Noble Paul via shalin)

105.SOLR-1033: Current entity's namespace is made available to all DIH
    Transformers. This allows one to use an output field of TemplateTransformer
    in other transformers, among other things.
    (Fergus McMenemie, Noble Paul via shalin)

106.SOLR-1066: New methods in DIH Context to expose Script details.
    ScriptTransformer changed to read scripts through the new API methods.
    (Noble Paul via shalin)

107.SOLR-1062: A DIH LogTransformer which can log data in a given template
    format. (Jon Baer, Noble Paul via shalin)

108.SOLR-1065: A DIH ContentStreamDataSource which can accept HTTP POST data
    in a content stream. This can be used to push data to Solr instead of
    just pulling it from DB/Files/URLs. (Noble Paul via shalin)

109.SOLR-1061: Improve DIH RegexTransformer to create multiple columns from
    regex groups. (Noble Paul via shalin)

110.SOLR-1059: Special DIH flags introduced for deleting documents by query or
    id, skipping rows and stopping further transforms. Use $deleteDocById,
    $deleteDocByQuery for deleting by id and query respectively. Use $skipRow
    to skip the current row but continue with the document. Use $stopTransform
    to stop further transformers. New methods are introduced in Context for
    deleting by id and query. (Noble Paul, Fergus McMenemie, shalin)

111.SOLR-1076: JdbcDataSource should resolve DIH variables in all its
    configuration parameters. (shalin)

112.SOLR-1055: Make DIH JdbcDataSource easily extensible by making the
    createConnectionFactory method protected and return a
    Callable<Connection> object. (Noble Paul, shalin)

113.SOLR-1058: DIH: JdbcDataSource can lookup javax.sql.DataSource using JNDI.
    Use a jndiName attribute to specify the location of the data source.
    (Jason Shepherd, Noble Paul via shalin)

114.SOLR-1083: A DIH Evaluator for escaping query characters.
    (Noble Paul, shalin)

115.SOLR-934: A MailEntityProcessor to enable indexing mails from
    POP/IMAP sources into a solr index. (Preetam Rao, shalin)

116.SOLR-1060: A DIH LineEntityProcessor which can stream lines of text from a
    given file to be indexed directly or for processing with transformers and
    child entities.
    (Fergus McMenemie, Noble Paul, shalin)

117.SOLR-1127: Add support for DIH field name to be templatized.
    (Noble Paul, shalin)

118.SOLR-1092: Added a new DIH command named 'import' which does not
    automatically clean the index. This is useful and more appropriate when one
    needs to import only some of the entities.
    (Noble Paul via shalin)

119.SOLR-1153: DIH 'deltaImportQuery' is honored on child entities as well
    (noble)

120.SOLR-1230: Enhanced dataimport.jsp to work with all DataImportHandler
    request handler configurations, rather than just a hardcoded /dataimport
    handler. (ehatcher)

121.SOLR-1235: disallow period (.) in DIH entity names (noble)

122.SOLR-1234: Multiple DIH does not work because all of them write to
    dataimport.properties. Use the handler name as the properties file name
    (noble)

123.SOLR-1348: Support binary field type in convertType logic in DIH
    JdbcDataSource (shalin)

124.SOLR-1406: DIH: Make FileDataSource and FileListEntityProcessor more
    extensible (Luke Forehand, shalin)

125.SOLR-1437: DIH: XPathEntityProcessor can deal with xpath syntaxes such as
    //tagname , /root//tagname (Fergus McMenemie via noble)
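
Item 27 above (SOLR-540) describes globbing in hl.fl, where *_text selects every
field name ending in _text. The sketch below shows one way such a glob could be
matched against a list of field names in plain Java. It is a hypothetical
illustration, not the highlighter's implementation, and it only treats "*" as a
wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FieldGlobSketch {
    /** Converts a glob where '*' means "any run of characters" into a regex. */
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                if (literal.length() > 0) {
                    // Quote literal text so characters like '.' are not treated as regex syntax.
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the field names that match the glob, in their original order. */
    static List<String> matchingFields(String glob, List<String> fieldNames) {
        Pattern p = globToPattern(glob);
        List<String> matches = new ArrayList<>();
        for (String name : fieldNames) {
            if (p.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> fields = List.of("title_text", "id", "body_text", "text_raw");
        System.out.println(matchingFields("*_text", fields));
    }
}
```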


Optimizations
----------------------
1. SOLR-374: Use IndexReader.reopen to save resources by re-using parts of the
    index that haven't changed. (Mark Miller via yonik)

2. SOLR-808: Write string keys in Maps as extern strings in the javabin format. (Noble Paul via shalin)

3. SOLR-475: New faceting method with better performance and smaller memory usage for
    multi-valued fields with many unique values but relatively few values per document.
    Controllable via the facet.method parameter - "fc" is the new default method and "enum"
    is the original method. (yonik)

4. SOLR-970: Use an ArrayList in SolrPluginUtils.parseQueryStrings
    since we know exactly how long the List will be in advance.
    (Kay Kay via hossman)

5. SOLR-1002: Change SolrIndexSearcher to use insertWithOverflow
    with reusable priority queue entries to reduce the amount of
    generated garbage during searching. (Mark Miller via yonik)

6. SOLR-971: Replace StringBuffer with StringBuilder for instances that do not require thread-safety.
    (Kay Kay via shalin)

7. SOLR-921: SolrResourceLoader must cache short class name vs fully qualified classname
    (Noble Paul, hossman via shalin)

8. SOLR-973: CommonsHttpSolrServer writes the xml directly to the server.
    (Noble Paul via shalin)

9. SOLR-1108: Remove un-needed synchronization in SolrCore constructor.
    (Noble Paul via shalin)

10. SOLR-1166: Speed up docset/filter generation by avoiding top-level
    score() call and iterating over leaf readers with TermDocs. (yonik)

11. SOLR-1169: SortedIntDocSet - a new small set implementation
    that saves memory over HashDocSet, is faster to construct,
    is ordered for easier implementation of skipTo, and is faster
    in the general case. (yonik)

12. SOLR-1165: Use Lucene Filters and pass them down to the Lucene
    search methods to filter earlier and improve performance. (yonik)

13. SOLR-1111: Use per-segment sorting to share fieldcache elements
    across unchanged segments. This saves memory and reduces
    commit times for incremental updates to the index. (yonik)

14. SOLR-1188: Minor efficiency improvement in TermVectorComponent related to ignoring positions or offsets (gsingers)

15. SOLR-1150: Load Documents for Highlighting one at a time rather than
    all at once to avoid OOM with many large Documents. (Siddharth Gargate via Mark Miller)

16. SOLR-1353: Implement and use reusable token streams for analysis. (Robert Muir, yonik)

17. SOLR-1296: Enables setting IndexReader's termInfosIndexDivisor via a new attribute to StandardIndexReaderFactory. Enables
    setting termIndexInterval to IndexWriter via SolrIndexConfig. (Jason Rutherglen, hossman, gsingers)

18. SOLR-846: DIH: Reduce memory consumption during delta import by removing
    keys when used (Ricky Leung, Noble Paul via shalin)

19. SOLR-974: DataImportHandler skips commit if no data has been updated.
    (Wojtek Piaseczny, shalin)

20. SOLR-1004: DIH: Check for abort more frequently during delta-imports.
    (Marc Sturlese, shalin)

21. SOLR-1098: DIH DateFormatTransformer can cache the format objects.
    (Noble Paul via shalin)

22. SOLR-1465: Replaced string concatenations with StringBuilder append
    calls in DIH XPathRecordReader. (Mark Miller, shalin)
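
Item 5 above (SOLR-1002) relies on the insertWithOverflow idiom: when a bounded
priority queue is full, inserting hands back the entry that fell out, and the
caller refills that same object for the next candidate instead of allocating a
new one. The sketch below imitates the idea with java.util.PriorityQueue. It is
not Solr's or Lucene's code, and the class, field and method names are
hypothetical.

```java
import java.util.PriorityQueue;

class TopNSketch {
    /** Mutable, reusable queue entry. */
    static final class Entry implements Comparable<Entry> {
        int doc;
        float score;

        @Override
        public int compareTo(Entry other) {
            return Float.compare(score, other.score);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>(); // min-heap on score
    private final int maxSize;

    TopNSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /**
     * Adds e. While there is room, returns null. Once full, returns the entry
     * that did not make it (the evicted minimum, or e itself if e is too small),
     * so the caller can reuse that object.
     */
    Entry insertWithOverflow(Entry e) {
        if (queue.size() < maxSize) {
            queue.add(e);
            return null;
        }
        Entry least = queue.peek();
        if (least != null && e.compareTo(least) > 0) {
            queue.poll();
            queue.add(e);
            return least;
        }
        return e;
    }

    /** Returns the n highest scores in descending order. */
    static float[] topScores(float[] scores, int n) {
        TopNSketch top = new TopNSketch(n);
        Entry spare = null;
        for (int doc = 0; doc < scores.length; doc++) {
            // Reuse the overflowed entry when there is one; allocate only otherwise.
            Entry e = (spare != null) ? spare : new Entry();
            e.doc = doc;
            e.score = scores[doc];
            spare = top.insertWithOverflow(e);
        }
        float[] result = new float[top.queue.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = top.queue.poll().score; // poll() yields ascending order
        }
        return result;
    }

    public static void main(String[] args) {
        float[] best = topScores(new float[] { 0.2f, 0.9f, 0.5f, 0.1f, 0.7f }, 3);
        System.out.println(java.util.Arrays.toString(best));
    }
}
```

With this pattern at most n + 1 Entry objects are allocated no matter how many
candidates are scanned, which is where the reduction in garbage comes from.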
|
||
|
||
Bug Fixes
|
||
----------------------
|
||
1. SOLR-774: Fixed logging level display (Sean Timm via Otis Gospodnetic)
|
||
|
||
2. SOLR-771: CoreAdminHandler STATUS should display 'normalized' paths (koji, hossman, shalin)
|
||
|
||
3. SOLR-532: WordDelimiterFilter now respects payloads and other attributes of the original Token by
|
||
using Token.clone() (Tricia Williams, gsingers)
|
||
|
||
4. SOLR-805: DisMax queries are not being cached in QueryResultCache (Todd Feak via koji)
|
||
|
||
5. SOLR-751: WordDelimiterFilter didn't adjust the start offset of single
|
||
tokens that started with delimiters, leading to incorrect highlighting.
|
||
(Stefan Oestreicher via yonik)
|
||
|
||
7. SOLR-843: SynonymFilterFactory cannot handle multiple synonym files correctly (koji)
|
||
|
||
8. SOLR-840: BinaryResponseWriter does not handle incompatible data in fields (Noble Paul via shalin)
|
||
|
||
9. SOLR-803: CoreAdminRequest.createCore fails because name parameter isn't set (Sean Colombo via ryan)
|
||
|
||
10. SOLR-869: Fix file descriptor leak in SolrResourceLoader#getLines (Mark Miller, shalin)
|
||
|
||
11. SOLR-872: Better error message for incorrect copyField destination (Noble Paul via shalin)
|
||
|
||
12. SOLR-879: Enable position increments in the query parser and fix the
|
||
example schema to enable position increments for the stop filter in
|
||
both the index and query analyzers to fix the bug with phrase queries
|
||
with stopwords. (yonik)
|
||
|
||
13. SOLR-836: Add missing "a" to the example stopwords.txt (yonik)
|
||
|
||
14. SOLR-892: Fix serialization of booleans for PHPSerializedResponseWriter
|
||
(yonik)
|
||
|
||
15. SOLR-898: Fix null pointer exception for the JSON response writer
|
||
based formats when nl.json=arrarr with null keys. (yonik)
|
||
|
||
16. SOLR-901: FastOutputStream ignores write(byte[]) call. (Noble Paul via shalin)
|
||
|
||
17. SOLR-807: BinaryResponseWriter writes fieldType.toExternal if it is not a supported type,
|
||
otherwise it writes fieldType.toObject. This fixes the bug with encoding/decoding UUIDField.
|
||
(koji, Noble Paul, shalin)
|
||
|
||
18. SOLR-863: SolrCore.initIndex should close the directory it gets for clearing the lock and
|
||
use the DirectoryFactory. (Mark Miller via shalin)
|
||
|
||
19. SOLR-802: Fix a potential null pointer error in the distributed FacetComponent
|
||
(David Bowen via ryan)
|
||
|
||
20. SOLR-346: Use perl regex to improve accuracy of finding latest snapshot in snapinstaller (billa)
|
||
|
||
21. SOLR-830: Use perl regex to improve accuracy of finding latest snapshot in snappuller (billa)
|
||
|
||
22. SOLR-897: Fixed Argument list too long error when there are lots of snapshots/backups (Dan Rosher via billa)
|
||
|
||
23. SOLR-925: Fixed highlighting on fields with multiValued="true" and termOffsets="true" (koji)
|
||
|
||
24. SOLR-902: FastInputStream#read(byte b[], int off, int len) gives incorrect results when amount left to read is less
|
||
than buffer size (Noble Paul via shalin)
|
||
|
||
25. SOLR-978: Old files are not removed from slaves after replication (Jaco, Noble Paul, shalin)
|
||
|
||
26. SOLR-883: Implicit properties are not set for Cores created through CoreAdmin (Noble Paul via shalin)
|
||
|
||
27. SOLR-991: Better error message when parsing solrconfig.xml fails due to malformed XML. Error message notes the name
|
||
of the file being parsed. (Michael Henson via shalin)
|
||
|
||
28. SOLR-1008: Fix stats.jsp XML encoding for <stat> item entries with ampersands in their names. (ehatcher)
|
||
|
||
29. SOLR-976: deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a <delete>.
Now both delete by id and delete by query can be specified at the same time as follows.
<delete>
  <id>05991</id><id>06000</id>
  <query>office:Bridgewater</query><query>office:Osaka</query>
</delete>
(koji)
|
||
|
||
30. SOLR-1016: HTTP 503 error changes to 500 in SolrCore (koji)
|
||
|
||
31. SOLR-1015: Incomplete information in replication admin page and http command response when server
|
||
is both master and slave i.e. when server is a repeater (Akshay Ukey via shalin)
|
||
|
||
32. SOLR-1018: Slave is unable to replicate when server acts as repeater (as both master and slave)
|
||
(Akshay Ukey, Noble Paul via shalin)
|
||
|
||
33. SOLR-1031: Fix XSS vulnerability in schema.jsp (Paul Lovvik via ehatcher)
|
||
|
||
34. SOLR-1064: registry.jsp incorrectly displaying info for last core initialized
|
||
regardless of what the current core is. (hossman)
|
||
|
||
35. SOLR-1072: absolute paths used in sharedLib attribute were
|
||
incorrectly treated as relative paths. (hossman)
|
||
|
||
36. SOLR-1104: Fix some rounding errors in LukeRequestHandler's histogram (hossman)
|
||
|
||
37. SOLR-1125: Use query analyzer rather than index analyzer for queryFieldType in QueryElevationComponent
|
||
(koji)
|
||
|
||
38. SOLR-1126: Replicated files have incorrect timestamp (Jian Han Guo, Jeff Newburn, Noble Paul via shalin)
|
||
|
||
39. SOLR-1094: Incorrect value of correctlySpelled attribute in some cases (David Smiley, Mark Miller via shalin)
|
||
|
||
40. SOLR-965: Better error message when <pingQuery> is not configured.
|
||
(Mark Miller via hossman)
|
||
|
||
41. SOLR-1135: Java replication creates Snapshot in the directory where Solr was launched (Jianhan Guo via shalin)
|
||
|
||
42. SOLR-1138: Query Elevation Component now gracefully handles missing queries. (gsingers)
|
||
|
||
43. SOLR-929: LukeRequestHandler should return "dynamicBase" only if the field is dynamic.
|
||
(Peter Wolanin, koji)
|
||
|
||
44. SOLR-1141: NullPointerException during snapshoot command in java based replication (Jian Han Guo, shalin)
|
||
|
||
45. SOLR-1078: Fixes to WordDelimiterFilter to avoid splitting or dropping
|
||
international non-letter characters such as non spacing marks. (yonik)
|
||
|
||
46. SOLR-825, SOLR-1221: Enables highlighting for range/wildcard/fuzzy/prefix queries if using hl.usePhraseHighlighter=true
|
||
and hl.highlightMultiTerm=true. Also make both options default to true. (Mark Miller, yonik)
|
||
|
||
47. SOLR-1174: Fix Logging admin form submit url for multicore. (Jacob Singh via shalin)
|
||
|
||
48. SOLR-1182: Fix bug in OrdFieldSource#equals which could cause a bug with OrdFieldSource caching
|
||
on OrdFieldSource#hashcode collisions. (Mark Miller)
|
||
|
||
49. SOLR-1207: equals method should compare this and other of DocList in DocSetBase (koji)
|
||
|
||
50. SOLR-1242: Human readable JVM info from system handler does integer cutoff rounding, even when dealing
|
||
with GB. Fixed to round to one decimal place. (Jay Hill, Mark Miller)
|
||
|
||
51. SOLR-1243: Admin RequestHandlers should not be cached over HTTP. (Mark Miller)
|
||
|
||
52. SOLR-1260: Fix implementations of set operations for DocList subclasses
|
||
and fix a bug in HashDocSet construction when offset != 0. These bugs
|
||
never manifested in normal Solr use and only potentially affect
|
||
custom code. (yonik)
|
||
|
||
53. SOLR-1171: Fix LukeRequestHandler so it doesn't rely on SolrQueryParser
|
||
and report incorrect stats when field names contain characters
|
||
SolrQueryParser considers special.
|
||
(hossman)
|
||
|
||
54. SOLR-1317: Fix CapitalizationFilterFactory to work when keep parameter is not specified.
|
||
(ehatcher)
|
||
|
||
55. SOLR-1342: CapitalizationFilterFactory uses incorrect term length calculations.
|
||
(Robert Muir via Mark Miller)
|
||
|
||
56. SOLR-1359: DoubleMetaphoneFilter didn't index original tokens if there was no
|
||
alternative, and could incorrectly skip or reorder tokens. (yonik)
|
||
|
||
57. SOLR-1360: Prevent PhoneticFilter from producing duplicate tokens. (yonik)
|
||
|
||
58. SOLR-1371: LukeRequestHandler/schema.jsp errored if schema had no
|
||
uniqueKey field. The new test for this also (hopefully) adds some
|
||
future proofing against similar bugs in the future. As a side
|
||
effect QueryElevationComponentTest was refactored, and a bug in
|
||
that test was found. (hossman)
|
||
|
||
59. SOLR-914: General finalize() improvements. No finalizer delegates
|
||
to the respective close/destroy method w/o first checking if it's
|
||
already been closed/destroyed; if it hasn't, a SEVERE error is
|
||
logged first. (noble, hossman)
|
||
|
||
60. SOLR-1362: WordDelimiterFilter had inconsistent behavior when setting
|
||
the position increment of tokens following a token consisting of all
|
||
delimiters, and could additionally lose big position increments.
|
||
(Robert Muir, yonik)
|
||
|
||
61. SOLR-1091: Jetty's use of CESU-8 for code points outside the BMP
|
||
resulted in invalid output from the serialized PHP writer. (yonik)
|
||
|
||
62. SOLR-1103: LukeRequestHandler (and schema.jsp) have been fixed to
|
||
include the "1" (ie: 2**0) bucket in the term histogram data.
|
||
(hossman)
|
||
|
||
63. SOLR-1398: Add offset corrections in PatternTokenizerFactory.
|
||
(Anders Melchiorsen, koji)
|
||
|
||
64. SOLR-1400: Properly handle zero-length tokens in TrimFilter. This
|
||
was not a bug in any released version. (Peter Wolanin, gsingers)
|
||
|
||
65. SOLR-1071: spellcheck.extendedResults returns an invalid JSON response
|
||
when count > 1. To fix, the extendedResults format was changed.
|
||
(Uri Boness, yonik)
|
||
|
||
66. SOLR-1381: Fixed improper handling of fields that have only term positions and not term offsets during Highlighting (Thorsten Fischer, gsingers)
|
||
|
||
67. SOLR-1427: Fixed registry.jsp issue with MBeans (gsingers)
|
||
|
||
68. SOLR-1468: SolrJ's XML response parsing threw an exception for null
|
||
names, such as those produced when facet.missing=true (yonik)
|
||
|
||
69. SOLR-1471: Fixed issue with calculating missing values for facets in single valued cases in Stats Component.
|
||
This is not correctly calculated for the multivalued case. (James Miller, gsingers)
|
||
|
||
70. SOLR-1481: Fixed omitHeader parameter for PHP ResponseWriter. (Jun Ohtani via billa)
|
||
|
||
71. SOLR-1448: Add weblogic.xml to solr webapp to enable correct operation in
|
||
WebLogic. (Ilan Rabinovitch via yonik)
|
||
|
||
72. SOLR-1504: empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.
|
||
(koji)
|
||
|
||
73. SOLR-1394: HTMLStripCharFilter split tokens that contained entities and
|
||
often calculated offsets incorrectly for entities.
|
||
(Anders Melchiorsen via yonik)
|
||
|
||
74. SOLR-1517: Admin pages could stall waiting for localhost name resolution
|
||
if reverse DNS wasn't configured; this was changed so the DNS resolution
|
||
is attempted only once the first time an admin page is loaded.
|
||
(hossman)
|
||
|
||
75. SOLR-1529: More than 8 deleteByQuery commands in a single request
|
||
caused an error to be returned, although the deletes were
|
||
still executed. (asmodean via yonik)
|
||
|
||
76. SOLR-800: Deep copy collections to avoid ConcurrentModificationException
|
||
in XPathEntityProcessor while streaming
|
||
(Kyle Morrison, Noble Paul via shalin)
|
||
|
||
77. SOLR-823: Request parameter variables ${dataimporter.request.xxx} are not
|
||
resolved in DIH (Mck SembWever, Noble Paul, shalin)
|
||
|
||
78. SOLR-728: Add synchronization to avoid race condition of multiple DIH
|
||
imports working concurrently (Walter Ferrara, shalin)
|
||
|
||
79. SOLR-742: Add ability to create dynamic fields with custom
|
||
DataImportHandler transformers (Wojtek Piaseczny, Noble Paul, shalin)
|
||
|
||
80. SOLR-832: Rows parameter is not honored in DIH non-debug mode and can
|
||
abort a running import in debug mode. (Akshay Ukey, shalin)
|
||
|
||
81. SOLR-838: The DIH VariableResolver obtained from a DataSource's context
|
||
does not have current data. (Noble Paul via shalin)
|
||
|
||
82. SOLR-864: DataImportHandler does not catch and log Errors (shalin)
|
||
|
||
83. SOLR-873: Fix case-sensitive field names and columns (Jon Baer, shalin)
|
||
|
||
84. SOLR-893: Unable to delete documents via SQL and deletedPkQuery with
|
||
deltaimport (Dan Rosher via shalin)
|
||
|
||
85. SOLR-888: DIH DateFormatTransformer cannot convert non-string type
|
||
(Amit Nithian via shalin)
|
||
|
||
86. SOLR-841: DataImportHandler should throw exception if a field does not
|
||
have column attribute (Michael Henson, shalin)
|
||
|
||
87. SOLR-884: CachedSqlEntityProcessor should check if the cache key is
|
||
present in the query results (Noble Paul via shalin)
|
||
|
||
88. SOLR-985: Fix thread-safety issue with DIH TemplateString for concurrent
|
||
imports with multiple cores. (Ryuuichi Kumai via shalin)
|
||
|
||
89. SOLR-999: DIH XPathRecordReader fails on XMLs with nodes mixed with
|
||
CDATA content. (Fergus McMenemie, Noble Paul via shalin)
|
||
|
||
90. SOLR-1000: DIH FileListEntityProcessor should not apply fileName filter to
|
||
directory names. (Fergus McMenemie via shalin)
|
||
|
||
91. SOLR-1009: Repeated column names result in duplicate values.
|
||
(Fergus McMenemie, Noble Paul via shalin)
|
||
|
||
92. SOLR-1017: Fix DIH thread-safety issue with last_index_time for concurrent
|
||
imports in multiple cores due to unsafe usage of SimpleDateFormat by
|
||
multiple threads. (Ryuuichi Kumai via shalin)
|
||
|
||
93. SOLR-1024: Calling abort on DataImportHandler import commits data instead
|
||
of calling rollback. (shalin)
|
||
|
||
94. SOLR-1037: DIH should not add null values in a row returned by
|
||
EntityProcessor to documents. (shalin)
|
||
|
||
95. SOLR-1040: DIH XPathEntityProcessor fails with an xpath like
|
||
/feed/entry/link[@type='text/html']/@href (Noble Paul via shalin)
|
||
|
||
96. SOLR-1042: Fix memory leak in DIH by making TemplateString non-static
|
||
member in VariableResolverImpl (Ryuuichi Kumai via shalin)
|
||
|
||
97. SOLR-1053: IndexOutOfBoundsException in DIH SolrWriter.getResourceAsString
|
||
when size of data-config.xml is a multiple of 1024 bytes.
|
||
(Herb Jiang via shalin)
|
||
|
||
98. SOLR-1077: IndexOutOfBoundsException with useSolrAddSchema in DIH
|
||
XPathEntityProcessor. (Sam Keen, Noble Paul via shalin)
|
||
|
||
99. SOLR-1080: DIH RegexTransformer should not replace if regex is not matched.
|
||
(Noble Paul, Fergus McMenemie via shalin)
|
||
|
||
100.SOLR-1090: DataImportHandler should load the data-config.xml using UTF-8
|
||
encoding. (Rui Pereira, shalin)
|
||
|
||
101.SOLR-1146: ConcurrentModificationException in DataImporter.getStatusMessages
|
||
(Walter Ferrara, Noble Paul via shalin)
|
||
|
||
102.SOLR-1229: Fixes for DIH deletedPkQuery, particularly when using
|
||
transformed Solr unique id's
|
||
(Lance Norskog, Noble Paul via ehatcher)
|
||
|
||
103.SOLR-1286: Fix the DIH commit parameter always defaulting to "true" even
|
||
if "false" is explicitly passed in. (Jay Hill, Noble Paul via ehatcher)
|
||
|
||
104.SOLR-1323: Reset XPathEntityProcessor's $hasMore/$nextUrl when fetching
|
||
next URL (noble, ehatcher)
|
||
|
||
105.SOLR-1450: DIH: Jdbc connection properties such as batchSize are not
|
||
applied if the driver jar is placed in solr_home/lib.
|
||
(Steve Sun via shalin)
|
||
|
||
106.SOLR-1474: DIH Delta-import should run even if last_index_time is not set.
|
||
(shalin)
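Item 92 above (SOLR-1017) attributes a DIH bug to SimpleDateFormat being shared by multiple threads. The following is a minimal sketch of one common remedy, giving each thread its own formatter; it is not the code from the actual patch. The class name SafeDateFormat, the date pattern, and the UTC time zone are assumptions chosen for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SafeDateFormat {
    // SimpleDateFormat keeps mutable state, so each thread gets its own instance.
    // The pattern and time zone below are illustrative assumptions.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    // Formats a date with the calling thread's private formatter.
    public static String format(Date date) {
        return FORMAT.get().format(date);
    }

    // Parses text with the calling thread's private formatter.
    public static Date parse(String text) throws ParseException {
        return FORMAT.get().parse(text);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format(new Date(0L)));
    }
}
```

Synchronizing around a single shared formatter would also work; the per-thread approach avoids lock contention when several imports run concurrently.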
|
||
|
||
|
||
Other Changes
|
||
----------------------
|
||
1. Upgraded to Lucene 2.4.0 (yonik)
|
||
|
||
2. SOLR-805: Upgraded to Lucene 2.9-dev (r707499) (koji)
|
||
|
||
3. DumpRequestHandler (/debug/dump): changed 'fieldName' to 'sourceInfo'. (ehatcher)
|
||
|
||
4. SOLR-852: Refactored common code in CSVRequestHandler and XMLUpdateRequestHandler (gsingers, ehatcher)
|
||
|
||
5. SOLR-871: Removed dependency on stax-utils.jar. If you are using solr.jar and running
|
||
java 6, you can also remove woodstox and geronimo. (ryan)
|
||
|
||
6. SOLR-465: Upgraded to Lucene 2.9-dev (r719351) (shalin)
|
||
|
||
7. SOLR-889: Upgraded to commons-io-1.4.jar and commons-fileupload-1.2.1.jar (ryan)
|
||
|
||
8. SOLR-875: Upgraded to Lucene 2.9-dev (r723985) and consolidated the BitSet implementations (Michael Busch, gsingers)
|
||
|
||
9. SOLR-819: Upgraded to Lucene 2.9-dev (r724059) to get access to Arabic public constructors (gsingers)
|
||
|
||
10. SOLR-900: Moved solrj into /src/solrj. The contents of solr-common.jar are now included
|
||
in the solr-solrj.jar. (ryan)
|
||
|
||
11. SOLR-924: Code cleanup: make all existing finalize() methods call
|
||
super.finalize() in a finally block. All current instances extend
|
||
Object, so this doesn't fix any bugs, but helps protect against
|
||
future changes. (Kay Kay via hossman)
|
||
|
||
12. SOLR-885: NamedListCodec is renamed to JavaBinCodec and returns Object instead of NamedList.
|
||
(Noble Paul, yonik via shalin)
|
||
|
||
13. SOLR-84: Use new Solr logo in admin (Michiel via koji)
|
||
|
||
14. SOLR-981: groupId for Woodstox dependency in maven solrj changed to org.codehaus.woodstox (Tim Taranov via shalin)
|
||
|
||
15. Upgraded to Lucene 2.9-dev r738218 (yonik)
|
||
|
||
16. SOLR-959: Refactored TestReplicationHandler to remove hardcoded port numbers (hossman, Akshay Ukey via shalin)
|
||
|
||
17. Upgraded to Lucene 2.9-dev r742220 (yonik)
|
||
|
||
18. SOLR-1022: Better "ignored" field in example schema.xml (Peter Wolanin via hossman)
|
||
|
||
19. SOLR-967: New type-safe constructor for NamedList (Kay Kay via hossman)
|
||
|
||
20. SOLR-1036: Change default QParser from "lucenePlusSort" to "lucene" to
|
||
reduce confusion of semicolon splitting behavior when no sort param is
|
||
specified (hossman)
|
||
|
||
21. Upgraded to Lucene 2.9-dev r752164 (shalin)
|
||
|
||
22. SOLR-1068: Use fsync on replicated index and configuration files (yonik, Noble Paul, shalin)
|
||
|
||
23. SOLR-952: Cleanup duplicated code in deprecated HighlightingUtils (hossman)
|
||
|
||
24. Upgraded to Lucene 2.9-dev r764281 (shalin)
|
||
|
||
25. SOLR-1079: Rename omitTf to omitTermFreqAndPositions (shalin)
|
||
|
||
26. SOLR-804: Added Lucene's misc contrib JAR (rev 764281). (gsingers)
|
||
|
||
27. Upgraded to Lucene 2.9-dev r768228 (shalin)
|
||
|
||
28. Upgraded to Lucene 2.9-dev r768336 (shalin)
|
||
|
||
29. SOLR-997: Wait for a longer time for slave to complete replication in TestReplicationHandler
|
||
(Mark Miller via shalin)
|
||
|
||
30. SOLR-748: FacetComponent helper classes are made public as an experimental API.
|
||
(Wojtek Piaseczny via shalin)
|
||
|
||
31. Upgraded to Lucene 2.9-dev r773862 (Mark Miller)
|
||
|
||
32. Upgraded to Lucene 2.9-dev r776177 (shalin)
|
||
|
||
33. SOLR-1149: Made QParserPlugin and related classes extendible as an experimental API.
|
||
(Kaktu Chakarabati via shalin)
|
||
|
||
34. Upgraded to Lucene 2.9-dev r779312 (yonik)
|
||
|
||
35. SOLR-786: Refactor DisMaxQParser to allow overriding certain features of DisMaxQParser
|
||
(Wojciech Biela via shalin)
|
||
|
||
36. SOLR-458: Add equals and hashCode methods to NamedList (Stefan Rinner, shalin)
|
||
|
||
37. SOLR-1184: Add option in solrconfig to open a new IndexReader rather than
|
||
using reopen. Done mainly as a fail-safe in the case that a user runs into
|
||
a reopen bug/issue. (Mark Miller)
|
||
|
||
38. SOLR-1215: use double quotes to enclose attributes in solr.xml (noble)
|
||
|
||
39. SOLR-1151: add dynamic copy field and maxChars example to example schema.xml.
|
||
(Peter Wolanin, Mark Miller)
|
||
|
||
40. SOLR-1233: remove /select?qt=/whatever restriction on /-prefixed request handlers.
|
||
(ehatcher)
|
||
|
||
41. SOLR-1257: logging.jsp has been removed and now passes through to the
|
||
hierarchical log level tool added in Solr 1.3. Users still
|
||
hitting "/admin/logging.jsp" should switch to "/admin/logging".
|
||
(hossman)
|
||
|
||
42. Upgraded to Lucene 2.9-dev r794238. Other changes include:
|
||
- LUCENE-1614 - Use Lucene's DocIdSetIterator.NO_MORE_DOCS as the sentinel value.
|
||
- LUCENE-1630 - Add acceptsDocsOutOfOrder method to Collector implementations.
|
||
- LUCENE-1673, LUCENE-1701 - Trie has moved to Lucene core and renamed to NumericRangeQuery.
|
||
- LUCENE-1662, LUCENE-1687 - Replace usage of ExtendedFieldCache by FieldCache.
|
||
(shalin)
|
||
|
||
42. SOLR-1241: Solr's CharFilter has been moved to Lucene. Remove CharFilter and related classes
|
||
from Solr and use Lucene's corresponding code (koji via shalin)
|
||
|
||
43. SOLR-1261: Lucene trunk renamed RangeQuery & Co to TermRangeQuery (Uwe Schindler via shalin)
|
||
|
||
44. Upgraded to Lucene 2.9-dev r801856 (Mark Miller)
|
||
|
||
45. SOLR-1276: Added StatsComponentTest (Rafał Kuć, gsingers)
|
||
|
||
46. SOLR-1377: The TokenizerFactory API has changed to explicitly return a Tokenizer
|
||
rather than a TokenStream (that may be or may not be a Tokenizer). This change
|
||
is required to take advantage of the Token reuse improvements in lucene 2.9. (ryan)
|
||
|
||
47. SOLR-1410: Log a warning if the deprecated charset option is used
|
||
on GreekLowerCaseFilterFactory, RussianStemFilterFactory,
|
||
RussianLowerCaseFilterFactory or RussianLetterTokenizerFactory.
|
||
(Robert Muir via hossman)
|
||
|
||
48. SOLR-1423: Due to LUCENE-1906, Solr's tokenizer should use Tokenizer.correctOffset() instead of CharStream.correctOffset().
|
||
(Uwe Schindler via koji)
|
||
|
||
49. SOLR-1319, SOLR-1345: Upgrade Solr Highlighter classes to new Lucene Highlighter API. This upgrade has
|
||
resulted in a back compat break in the DefaultSolrHighlighter class - getQueryScorer is no longer
|
||
protected. If you happened to be overriding that method in custom code, override getHighlighter instead.
|
||
Also, HighlightingUtils#getQueryScorer has been removed as it was deprecated and backcompat has been
|
||
broken with it anyway. (Mark Miller)
|
||
|
||
50. SOLR-1357: SolrInputDocument cannot process dynamic fields (Lars Grote via noble)
|
||
|
||
51. SOLR-1075: Upgrade to Tika 0.3. See http://www.apache.org/dist/lucene/tika/CHANGES-0.3.txt (gsingers)
|
||
|
||
52. SOLR-1310: Upgrade to Tika 0.4. Note there are some differences in
|
||
detecting Languages now in extracting request handler.
|
||
See http://www.lucidimagination.com/search/document/d6f1899a85b2a45c/vote_apache_tika_0_4_release_candidate_2#d6f1899a85b2a45c
|
||
for discussion on language detection.
|
||
See http://www.apache.org/dist/lucene/tika/CHANGES-0.4.txt. (gsingers)
|
||
|
||
53. SOLR-782: DIH: Refactored SolrWriter to make it a concrete class and
|
||
removed wrappers over SolrInputDocument. Refactored to load Evaluators
|
||
lazily. Removed multiple document nodes in the configuration xml. Removed
|
||
support for 'default' variables, they are automatically available as
|
||
request parameters. (Noble Paul via shalin)
|
||
|
||
54. SOLR-964: DIH: XPathEntityProcessor now ignores DTD validations
|
||
(Fergus McMenemie, Noble Paul via shalin)
|
||
|
||
55. SOLR-1029: DIH: Standardize Evaluator parameter parsing and added helper
|
||
functions for parsing all evaluator parameters in a standard way.
|
||
(Noble Paul, shalin)
|
||
|
||
56. SOLR-1081: Change DIH EventListener to be an interface so that components
|
||
such as an EntityProcessor or a Transformer can act as an event listener.
|
||
(Noble Paul, shalin)
|
||
|
||
57. SOLR-1027: DIH: Alias the 'dataimporter' namespace to a shorter name 'dih'.
|
||
(Noble Paul via shalin)
|
||
|
||
58. SOLR-1084: Better error reporting when DIH entity name is a reserved word
|
||
and data-config.xml root node is not <dataConfig>.
|
||
(Noble Paul via shalin)
|
||
|
||
59. SOLR-1087: Deprecate 'where' attribute in CachedSqlEntityProcessor in
|
||
favor of cacheKey and cacheLookup. (Noble Paul via shalin)
|
||
|
||
60. SOLR-969: Change the FULL_DUMP, DELTA_DUMP, FIND_DELTA constants in DIH
|
||
Context to String. Change Context.currentProcess() to return a string
|
||
instead of an integer. (Kay Kay, Noble Paul, shalin)
|
||
|
||
61. SOLR-1120: Simplified DIH EntityProcessor API by moving logic for applying
|
||
transformers and handling multi-row outputs from Transformers into an
|
||
EntityProcessorWrapper class. The behavior of the method
|
||
EntityProcessor#destroy has been modified to be called once per parent-row
|
||
at the end of row. A new method EntityProcessor#close is added which is
|
||
called at the end of import. A new method
|
||
Context#getResolvedEntityAttribute is added which returns the resolved
|
||
value of an entity's attribute. Introduced a DocWrapper which takes care
|
||
of maintaining document level session variables.
|
||
(Noble Paul, shalin)
|
||
|
||
62. SOLR-1265: Add DIH variable resolving for URLDataSource properties like
|
||
baseUrl. (Chris Eldredge via ehatcher)
|
||
|
||
63. SOLR-1269: Better error messages from DIH JdbcDataSource when JDBC Driver
|
||
name or SQL is incorrect. (ehatcher, shalin)
|
||
|
||
|
||
Build
|
||
----------------------
|
||
1. SOLR-776: Added in ability to sign artifacts via Ant for releases (gsingers)
|
||
|
||
2. SOLR-854: Added run-example target (Mark Miller via ehatcher)
|
||
|
||
3. SOLR-1054: Fix dist-src target for DataImportHandler (Ryuuichi Kumai via shalin)
|
||
|
||
4. SOLR-1219: Added proxy.setup target (koji)
|
||
|
||
5. SOLR-1386: In build.xml, use longfile="gnu" in tar task to avoid warnings about long file names
|
||
(Mark Miller via shalin)
|
||
|
||
6. SOLR-1441: Make it possible to run all tests in a package (shalin)
|
||
|
||
|
||
Documentation
|
||
----------------------
|
||
1. SOLR-789: The javadoc of RandomSortField is not readable (Nicolas Lalevée via koji)
|
||
|
||
2. SOLR-962: Note about null handling in ModifiableSolrParams.add javadoc
|
||
(Kay Kay via hossman)
|
||
|
||
3. SOLR-1409: Added Solr Powered By Logos
|
||
|
||
4. SOLR-1369: Add HSQLDB Jar to example-DIH, unzip database and update
|
||
instructions.
|
||
|
||
|
||
================== Release 1.3.0 ==================
|
||
|
||
Upgrading from Solr 1.2
-----------------------
IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
should be upgraded before the master! If the master were to be updated
first, the older searchers would not be able to read the new index format.

The Porter snowball based stemmers in Lucene were updated (LUCENE-1142),
and are not guaranteed to be backward compatible at the index level
(the stem of certain words may have changed). Re-indexing is recommended.

Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files should be needed.

This version of Solr contains a new version of Lucene implementing
an updated index format. This version of Solr/Lucene can still read
and update indexes in the older formats, and will convert them to the new
format on the first index change. Be sure to back up your index before
upgrading in case you need to downgrade.

Solr now recognizes HTTP Request headers related to HTTP Caching (see
RFC 2616 sec13) and will by default respond with "304 Not Modified"
when appropriate. This should only affect users who access Solr via
an HTTP Cache, or via a Web-browser that has an internal cache, but if
you wish to suppress this behavior an '<httpCaching never304="true"/>'
option can be added to your solrconfig.xml. See the wiki (or the
example solrconfig.xml) for more details...
http://wiki.apache.org/solr/SolrConfigXml#HTTPCaching
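
To illustrate the kind of check behind a "304 Not Modified" response, here is a minimal sketch of an If-None-Match comparison in the spirit of RFC 2616. It is not Solr's implementation: the class ConditionalGet and its method names are hypothetical, and splitting the header on commas is a simplification that assumes entity tags contain no commas.

```java
import java.util.Arrays;

public class ConditionalGet {
    // Returns true when the client's If-None-Match header matches the
    // server's current entity tag, i.e. when a 304 response would be appropriate.
    public static boolean notModified(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || currentEtag == null) {
            return false; // no validator supplied: send the full response
        }
        String header = ifNoneMatch.trim();
        if (header.equals("*")) {
            return true; // "*" matches any current entity
        }
        String current = stripWeak(currentEtag.trim());
        // Simplification: assumes entity tags themselves contain no commas.
        return Arrays.stream(header.split(","))
                     .map(String::trim)
                     .anyMatch(tag -> stripWeak(tag).equals(current));
    }

    // Weak comparison ignores the "W/" prefix on either side.
    private static String stripWeak(String tag) {
        return tag.startsWith("W/") ? tag.substring(2) : tag;
    }

    public static void main(String[] args) {
        System.out.println(notModified("\"abc\"", "\"abc\""));
    }
}
```

A client or intermediate cache that sends a matching validator would receive the 304 status with no body and reuse its cached copy.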
|
||
|
||
In Solr 1.2, DateField did not enforce the canonical representation of
the ISO 8601 format when parsing incoming data, and did not generate
the canonical format when generating dates from "Date Math" strings
(particularly as it pertains to milliseconds ending in trailing zeros).
As a result equivalent dates could not always be compared properly.
This problem is corrected in Solr 1.3, but DateField users that might
have been affected by indexing inconsistent formats of equivalent
dates (ie: 1995-12-31T23:59:59Z vs 1995-12-31T23:59:59.000Z) may want
to consider reindexing to correct these inconsistencies. Users who
depend on some of the "broken" behavior of DateField in Solr 1.2
(specifically: accepting any input that ends in a 'Z') should consider
using the LegacyDateField class as a possible alternative. Users that
desire 100% backwards compatibility should consider using the Solr 1.2
version of DateField.
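
The trailing-zero issue described above can be sketched as follows. This is an illustration only, not DateField's code: the class CanonicalDate and its method are hypothetical, and the sketch assumes the canonical form drops trailing zeros from the fractional seconds (and the decimal point when nothing remains), as the 1995-12-31T23:59:59Z example suggests.

```java
public class CanonicalDate {
    // Normalizes the fractional-seconds part of a UTC timestamp ending in 'Z'
    // by removing trailing zeros, and the '.' itself if no digits remain.
    public static String canonicalize(String date) {
        if (!date.endsWith("Z")) {
            throw new IllegalArgumentException("Expected a date ending in 'Z': " + date);
        }
        String body = date.substring(0, date.length() - 1);
        int dot = body.indexOf('.');
        if (dot < 0) {
            return date; // no fractional seconds, nothing to normalize
        }
        int end = body.length();
        while (end > dot + 1 && body.charAt(end - 1) == '0') {
            end--; // strip one trailing zero
        }
        if (end == dot + 1) {
            end = dot; // only zeros followed the '.', so drop the '.' too
        }
        return body.substring(0, end) + "Z";
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("1995-12-31T23:59:59.000Z"));
    }
}
```

With a normalization of this kind, both spellings of the same instant map to one string, so equivalent dates compare as equal.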
|
||
|
||
Due to some changes in the lifecycle of TokenFilterFactories, users of
Solr 1.2 who have written Java code which constructs new instances of
StopFilterFactory, SynonymFilterFactory, or EnglishPorterFilterFactory
will need to modify their code by adding a line like the following
prior to using the factory object...
  factory.inform(SolrCore.getSolrCore().getSolrConfig().getResourceLoader());
These lifecycle changes do not affect people who use Solr "out of the
box" or who have developed their own TokenFilterFactory plugins. More
info can be found in SOLR-594.

The python client that used to ship with Solr is no longer included in
the distribution (see client/python/README.txt).
|
||
|
||
Detailed Change List
|
||
--------------------
|
||
|
||
New Features
|
||
1. SOLR-69: Adding MoreLikeThisHandler to search for similar documents using
|
||
lucene contrib/queries MoreLikeThis. MoreLikeThis is also available from
|
||
the StandardRequestHandler using ?mlt=true. (bdelacretaz, ryan)
|
||
|
||
2. SOLR-253: Adding KeepWordFilter and KeepWordFilterFactory. A TokenFilter
|
||
that keeps tokens with text in the registered keeplist. This behaves like
|
||
the inverse of StopFilter. (ryan)
|
||
|
||
3. SOLR-257: WordDelimiterFilter has a new parameter splitOnCaseChange,
|
||
which can be set to 0 to disable splitting "PowerShot" => "Power" "Shot".
|
||
(klaas)
|
||
|
||
4. SOLR-193: Adding SolrDocument and SolrInputDocument to represent documents
|
||
outside of the lucene Document infrastructure. This class will be used
|
||
by clients and for processing documents. (ryan)
|
||
|
||
5. SOLR-244: Added ModifiableSolrParams - a SolrParams implementation that
|
||
helps you change values after initialization. (ryan)
|
||
|
||
6. SOLR-20: Added a java client interface with two implementations. One
|
||
implementation uses commons httpclient to connect to solr via HTTP. The
|
||
other connects to solr directly. Check client/java/solrj. This addition
|
||
also includes tests that start jetty and test a connection using the full
|
||
HTTP request cycle. (Darren Erik Vengroff, Will Johnson, ryan)
|
||
|
||
7. SOLR-133: Added StaxUpdateRequestHandler that uses StAX for XML parsing.
|
||
This implementation has much better error checking and lets you configure
|
||
a custom UpdateRequestProcessor that can selectively process update
|
||
requests depending on the request attributes. This class will likely
|
||
replace XmlUpdateRequestHandler. (Thorsten Scherler, ryan)
|
||
|
||
8. SOLR-264: Added RandomSortField, a utility field with a random sort order.
|
||
The seed is based on a hash of the field name, so a dynamic field
|
||
of this type is useful for generating different random sequences.
|
||
This field type should only be used for sorting or as a value source
|
||
in a FunctionQuery (ryan, hossman, yonik)
|
||
|
||
9. SOLR-266: Adding show=schema to LukeRequestHandler to show the parsed
|
||
schema fields and field types. (ryan)
|
||
|
||
10. SOLR-133: The UpdateRequestHandler now accepts multiple delete options
|
||
within a single request. For example, sending:
|
||
<delete><id>1</id><id>2</id></delete> will delete both 1 and 2. (ryan)
|
||
|
||
11. SOLR-269: Added UpdateRequestProcessor plugin framework. This provides
|
||
a reasonable place to process documents after they are parsed and
|
||
before they are committed to the index. This is a good place for custom
|
||
document manipulation or document based authorization. (yonik, ryan)
|
||
|
||
12. SOLR-260: Converting to a standard PluginLoader framework. This reworks
|
||
RequestHandlers, FieldTypes, and QueryResponseWriters to share the same
|
||
base code for loading and initializing plugins. This adds a new
|
||
configuration option to define the default RequestHandler and
|
||
QueryResponseWriter in XML using default="true". (ryan)
|
||
|
||
13. SOLR-225: Enable pluggable highlighting classes. Allow configurable
|
||
highlighting formatters and Fragmenters. (ryan)
|
||
|
||
14. SOLR-273/376/452/516: Added hl.maxAnalyzedChars highlighting parameter, defaulting
|
||
to 50k, hl.alternateField, which allows the specification of a backup
|
||
field to use as summary if no keywords are matched, and hl.mergeContiguous,
|
||
which combines fragments if they are adjacent in the source document.
|
||
(klaas, Grant Ingersoll, Koji Sekiguchi via klaas)
|
||
|
||
15. SOLR-291: Control maximum number of documents to cache for any entry
|
||
in the queryResultCache via queryResultMaxDocsCached solrconfig.xml
|
||
entry. (Koji Sekiguchi via yonik)
|
||
|
||
16. SOLR-240: New <lockType> configuration setting in <mainIndex> and
|
||
<indexDefaults> blocks supports all Lucene builtin LockFactories.
|
||
'single' is the recommended setting, but 'simple' is the default for total
|
||
backwards compatibility.
|
||
(Will Johnson via hossman)
|
||
|
||
17. SOLR-248: Added CapitalizationFilterFactory that creates tokens with
|
||
normalized capitalization. This filter is useful for facet display,
|
||
but will not work with a prefix query. (ryan)
|
||
SOLR-468: Change to the semantics to keep the original token, not the
|
||
token in the Map. Also switched to use Lucene's new reusable token
|
||
capabilities. (gsingers)
|
||
|
||
18. SOLR-307: Added NGramFilterFactory and EdgeNGramFilterFactory.
|
||
(Thomas Peuss via Otis Gospodnetic)
|
||
|
||
19. SOLR-305: analysis.jsp can be given a fieldtype instead of a field
|
||
name. (hossman)
|
||
|
||
20. SOLR-102: Added RegexFragmenter, which splits text for highlighting
|
||
based on a given pattern. (klaas)
|
||
|
||
21. SOLR-258: Date Faceting added to SimpleFacets. Facet counts
|
||
computed for ranges of size facet.date.gap (a DateMath expression)
|
||
between facet.date.start and facet.date.end. (hossman)
|
||
|
||
22. SOLR-196: A PHP serialized "phps" response writer that returns a
|
||
serialized array that can be used with the PHP function unserialize,
|
||
and a PHP response writer "php" that may be used by eval.
|
||
(Nick Jenkin, Paul Borgermans, Pieter Berkel via yonik)
|
||
|
||
23. SOLR-308: A new UUIDField class which accepts UUID string values,
|
||
as well as the special value of "NEW" which triggers generation of
|
||
a new random UUID.
|
||
(Thomas Peuss via hossman)
|
||
|
||
24. SOLR-349: New FunctionQuery functions: sum, product, div, pow, log,
|
||
sqrt, abs, scale, map. Constants may now be used as a value source.
|
||
(yonik)
|
||
|
||
25. SOLR-359: Add field type className to Luke response, and enabled access
|
||
to the detailed field information from the solrj client API.
|
||
(Grant Ingersoll via ehatcher)
|
||
|
||
26. SOLR-334: Pluggable query parsers. Allows specification of query
|
||
type and arguments as a prefix on a query string. (yonik)
|
||
|
||
27. SOLR-351: External Value Source. An external file may be used
|
||
to specify the values of a field, currently usable as
|
||
a ValueSource in a FunctionQuery. (yonik)
|
||
|
||
28. SOLR-395: Many new features for the spell checker implementation, including
|
||
an extended response mode with much richer output, multi-word spell checking,
|
||
and a bevy of new and renamed options (see the wiki).
|
||
(Mike Krimerman, Scott Taber via klaas).
|
||
|
||
29. SOLR-408: Added PingRequestHandler and deprecated SolrCore.getPingQueryRequest().
|
||
Ping requests should be configured using standard RequestHandler syntax in
|
||
solrconfig.xml rather then using the <pingQuery></pingQuery> syntax.
|
||
(Karsten Sperling via ryan)
|
||
|
||
30. SOLR-281: Added a 'Search Component' interface and converted StandardRequestHandler
|
||
and DisMaxRequestHandler to use this framework.
|
||
(Sharad Agarwal, Henri Biestro, yonik, ryan)
|
||
|
||
31. SOLR-176: Add detailed timing data to query response output. The SearchHandler
|
||
interface now returns how long each section takes. (klaas)
|
||
|
||
32. SOLR-414: Plugin initialization now supports SolrCore and ResourceLoader "Aware"
|
||
plugins. Plugins that implement SolrCoreAware or ResourceLoaderAware are
|
||
informed about the SolrCore/ResourceLoader. (Henri Biestro, ryan)
|
||
|
||
33. SOLR-350: Support multiple SolrCores running in the same solr instance and allows
|
||
runtime runtime management for any running SolrCore. If a solr.xml file exists
|
||
in solr.home, this file is used to instanciate multiple cores and enables runtime
|
||
core manipulation. For more informaion see: http://wiki.apache.org/solr/CoreAdmin
|
||
(Henri Biestro, ryan)
|
||
|
||
34. SOLR-447: Added a single request handler that will automatically register all
standard admin request handlers. This replaces the need to register (and maintain)
the set of admin request handlers. Assuming solrconfig.xml includes:
<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
This will register: Luke/SystemInfo/PluginInfo/ThreadDump/PropertiesRequestHandler.
(ryan)

35. SOLR-142: Added RawResponseWriter and ShowFileRequestHandler. This returns config
files directly. If AdminHandlers are configured, this will be added automatically.
The jsp files /admin/get-file.jsp and /admin/raw-schema.jsp have been deprecated.
The deprecated <admin><gettableFiles> will be automatically registered with
a ShowFileRequestHandler instance for backwards compatibility. (ryan)

36. SOLR-446: TextResponseWriter can write SolrDocuments and SolrDocumentLists the
same way it writes Document and DocList. (yonik, ryan)

37. SOLR-418: Adding a query elevation component. This is an optional component to
elevate some documents to the top positions (or exclude them) for a given query.
(ryan)

38. SOLR-478: Added ability to get back unique key information from the LukeRequestHandler.
(gsingers)

39. SOLR-127: HTTP Caching awareness. Solr now recognizes HTTP Request
headers related to HTTP Caching (see RFC 2616 sec13) and will respond
with "304 Not Modified" when appropriate. New options have been added
to solrconfig.xml to influence this behavior.
(Thomas Peuss via hossman)

40. SOLR-303: Distributed Search over HTTP. Specification of shards
argument causes Solr to query those shards and merge the results
into a single response. Querying, field faceting (sorted only),
query faceting, highlighting, and debug information are supported
in distributed mode.
(Sharad Agarwal, Patrick O'Leary, Sabyasachi Dalal, Stu Hood,
Jayson Minard, Lars Kotthoff, ryan, yonik)

41. SOLR-356: Pluggable functions (value sources) that allow
registration of new functions via solrconfig.xml
(Doug Daniels via yonik)

42. SOLR-494: Added cool admin Ajaxed schema explorer.
(Greg Ludington via ehatcher)

43. SOLR-497: Added date faceting to the QueryResponse in SolrJ
and QueryResponseTest (Shalin Shekhar Mangar via gsingers)

44. SOLR-486: Binary response format, faster and smaller
than XML and JSON response formats (use wt=javabin).
Added BinaryResponseParser for using the binary format via SolrJ;
it is now the default.
(Noble Paul, yonik)

45. SOLR-521: StopFilterFactory support for "enablePositionIncrements"
(Walter Ferrara via hossman)

46. SOLR-557: Added SolrCore.getSearchComponents() to return an unmodifiable Map. (gsingers)

47. SOLR-516: Added hl.maxAlternateFieldLength parameter, to set max length for hl.alternateField
(Koji Sekiguchi via klaas)

48. SOLR-319: Changed SynonymFilterFactory to "tokenize" synonyms file.
To use a tokenizer, specify "tokenizerFactory" attribute in <filter>.
For example:
<tokenizer class="solr.CJKTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" expand="true"
ignoreCase="true" tokenizerFactory="solr.CJKTokenizerFactory"/>
(koji)

49. SOLR-515: Added SimilarityFactory capability to schema.xml,
making config file parameters usable in the construction of
the global Lucene Similarity implementation.
(ehatcher)

50. SOLR-536: Add a DocumentObjectBinder to solrj that converts Objects to and
from SolrDocuments. (Noble Paul via ryan)

51. SOLR-595: Add support for Field level boosting in the MoreLikeThis Handler.
(Tom Morton, gsingers)

52. SOLR-572: Added SpellCheckComponent and org.apache.solr.spelling package to support more spell
checking functionality. Also includes ability to add your own SolrSpellChecker implementation that
plugs in. See http://wiki.apache.org/solr/SpellCheckComponent for more details
(Shalin Shekhar Mangar, Bojan Smid, gsingers)

53. SOLR-679: Added accessor methods to Lucene based spell checkers (gsingers)

54. SOLR-423: Added Request Handler close hook notification so that RequestHandlers can be notified
when a core is closing. (gsingers, ryan)

55. SOLR-603: Added ability to partially optimize. (gsingers)

56. SOLR-483: Add byte/short sorting support (gsingers)

57. SOLR-14: Add preserveOriginal flag to WordDelimiterFilter
(Geoffrey Young, Trey Hyde, Ankur Madnani, yonik)

58. SOLR-502: Add search timeout support. (Sean Timm via yonik)

59. SOLR-605: Add the ability to register callbacks programmatically (ryan, Noble Paul)

60. SOLR-610: hl.maxAnalyzedChars can be -1 to highlight everything (Lars Kotthoff via klaas)

61. SOLR-522: Make analysis.jsp show payloads. (Tricia Williams via yonik)

62. SOLR-611: Expose sort_values returned by QueryComponent in SolrJ's QueryResponse
(Dan Rosher via shalin)

63. SOLR-256: Support exposing Solr statistics through JMX (Sharad Agrawal, shalin)

64. SOLR-666: Expose warmup time in statistics for SolrIndexSearcher and LRUCache (shalin)

65. SOLR-663: Allow multiple files for stopwords, keepwords, protwords and synonyms
(Otis Gospodnetic, shalin)

66. SOLR-469: Added DataImportHandler as a contrib project which makes indexing data from Databases,
XML files and HTTP data sources into Solr quick and easy. Includes API and implementations for
supporting multiple data sources, processors and transformers for importing data. Supports full
data imports as well as incremental (delta) indexing. See http://wiki.apache.org/solr/DataImportHandler
for more details. (Noble Paul, shalin)

67. SOLR-622: SpellCheckComponent supports auto-loading indices on startup and optionally, (re)builds
indices on newSearcher event, if configured in solrconfig.xml (shalin)

68. SOLR-554: Hierarchical JDK log level selector for SOLR Admin replaces logging.jsp
(Sean Timm via shalin)

69. SOLR-506: Emitting HTTP Cache headers can be enabled or disabled through configuration on a
per-handler basis (shalin)

70. SOLR-716: Added support for properties in configuration files. Properties can be specified in
solr.xml and can be used in solrconfig.xml and schema.xml (Henri Biestro, hossman, ryan, shalin)

71. SOLR-1129: Support binding dynamic fields to beans in SolrJ (Avlesh Singh, noble)

72. SOLR-920: Cache and reuse IndexSchema. A new attribute added in solr.xml called 'shareSchema' (noble)

73. SOLR-700: DIH: Allow configurable locales through a locale attribute in
fields for NumberFormatTransformer. (Stefan Oestreicher, shalin)
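
As a rough illustration of the date faceting parameters named in entry 21
(SOLR-258) above, the sketch below builds a request URL. The host, port,
handler path and the "timestamp" field name are assumptions made up for the
example. The facet.date.start, facet.date.end and facet.date.gap names come
from the entry; facet=true and facet.date (naming the field) are assumed to be
the switches that turn the feature on.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

def date_facet_url(base, field, start, end, gap):
    """Build a select URL asking for date facet counts on one field."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.date", field),        # field to facet on (hypothetical name below)
        ("facet.date.start", start),  # lower bound of the first range
        ("facet.date.end", end),      # upper bound of the last range
        ("facet.date.gap", gap),      # size of each range, a DateMath expression
    ]
    # urlencode percent-encodes '+', which would otherwise be read as a space.
    return base + "?" + urlencode(params)

url = date_facet_url("http://localhost:8983/solr/select", "timestamp",
                     "NOW/DAY-5DAYS", "NOW/DAY+1DAY", "+1DAY")
print(url)
```

The point worth noticing is the encoding of "+" in the gap expression: sent
unencoded in a query string, it would arrive as a space.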

Changes in runtime behavior
1. SOLR-559: use Lucene updateDocument, deleteDocuments methods. This
removes the maxBufferedDeletes parameter added by SOLR-310 as Lucene
now manages the deletes. This provides slightly better indexing
performance and makes overwrites atomic, eliminating the possibility of
a crash causing duplicates. (yonik)

2. SOLR-689 / SOLR-695: If you have used "MultiCore" functionality in an unreleased
version of 1.3-dev, many classes and configs have been renamed for the official
1.3 release. Specifically, solr.xml has replaced multicore.xml, and uses a slightly
different syntax. The solrj classes: MultiCore{Request/Response/Params} have been
renamed: CoreAdmin{Request/Response/Params} (hossman, ryan, Henri Biestro)

3. SOLR-647: reference count the SolrCore uses to prevent a premature
close while a core is still in use. (Henri Biestro, Noble Paul, yonik)

4. SOLR-737: SolrQueryParser now uses a ConstantScoreQuery for wildcard
queries, which prevents an exception from being thrown when the number
of matching terms exceeds the BooleanQuery clause limit. (yonik)

Optimizations
1. SOLR-276: improve JSON writer speed. (yonik)

2. SOLR-310: bound and reduce memory usage by providing <maxBufferedDeletes> parameter,
which flushes deletes without forcing the user to use <commit/> for this purpose.
(klaas)

3. SOLR-348: short-circuit faceting if less than mincount docs match. (yonik)

4. SOLR-354: Optimize removing all documents. Now when a delete by query
of *:* is issued, the current index is removed. (yonik)

5. SOLR-377: Speed up response writers. (yonik)

6. SOLR-342: Added support into the SolrIndexWriter for using several new features of the new
Lucene IndexWriter, including: setRAMBufferSizeMB(), setMergePolicy(), setMergeScheduler().
Also, added support to specify Lucene's autoCommit functionality (not to be confused with Solr's
similarly named autoCommit functionality) via the <luceneAutoCommit> config item. See the test
and example solrconfig.xml <indexDefaults> section for usage. Performance during indexing should
be significantly increased by moving up to Lucene 2.3 due to Lucene's new indexing capabilities.
Furthermore, the setRAMBufferSizeMB makes it more logical to decide on tuning factors related to
indexing. For best performance, leave the mergePolicy and mergeScheduler as the defaults and set
ramBufferSizeMB instead of maxBufferedDocs. The best value for this depends on the types of
documents in use. 32 should be a good starting point, but reports have shown up to 48 MB provides
good results. Note, it is acceptable to set both ramBufferSizeMB and maxBufferedDocs, and Lucene
will flush based on whichever limit is reached first. (gsingers)

7. SOLR-330: Converted TokenStreams to use Lucene's new char array based
capabilities. (gsingers)

8. SOLR-624: Only take snapshots if there are differences to the index (Richard Trey Hyde via gsingers)

9. SOLR-587: Delete by Query performance greatly improved by using
new underlying Lucene IndexWriter implementation. (yonik)

10. SOLR-730: Use read-only IndexReaders that don't synchronize
isDeleted(). This will speed up function queries and *:* queries
as well as improve their scalability on multi-CPU systems.
(Mark Miller via yonik)
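
The note in optimization 6 (SOLR-342) that Lucene flushes on "whichever limit
is reached first" when both ramBufferSizeMB and maxBufferedDocs are set can be
pictured with a toy model. This is only a sketch of the rule as stated there,
not Lucene or Solr code; the ToyBuffer class, the document sizes and the limit
of 1000 documents are invented for illustration.

```python
class ToyBuffer:
    """Toy model of an indexing buffer with two independent flush limits."""

    def __init__(self, ram_buffer_size_mb=None, max_buffered_docs=None):
        self.ram_limit = ram_buffer_size_mb
        self.doc_limit = max_buffered_docs
        self.docs = 0
        self.ram_mb = 0.0
        self.flushes = []  # reason for each flush, in order

    def add(self, doc_size_mb):
        """Buffer one document and flush if either limit has been reached."""
        self.docs += 1
        self.ram_mb += doc_size_mb
        if self.ram_limit is not None and self.ram_mb >= self.ram_limit:
            self._flush("ram")
        elif self.doc_limit is not None and self.docs >= self.doc_limit:
            self._flush("docs")

    def _flush(self, reason):
        self.flushes.append(reason)
        self.docs = 0
        self.ram_mb = 0.0

buf = ToyBuffer(ram_buffer_size_mb=32, max_buffered_docs=1000)
for _ in range(100):
    buf.add(0.5)    # large documents: the RAM limit trips first
for _ in range(1000):
    buf.add(0.001)  # tiny documents: the document-count limit trips first
print(buf.flushes)
```

With large documents the RAM limit governs, and with tiny ones the document
count does, which is presumably why the entry suggests tuning ramBufferSizeMB
when document sizes vary.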

Bug Fixes
1. Make TextField respect sortMissingFirst and sortMissingLast fields.
(J.J. Larrea via yonik)

2. autoCommit/maxDocs was not working properly when large autoCommit/maxTime
was specified (klaas)

3. SOLR-283: autoCommit was not working after delete. (ryan)

4. SOLR-286: ContentStreamBase was not using default encoding for getBytes()
(Toru Matsuzawa via ryan)

5. SOLR-292: Fix MoreLikeThis facet counting. (Pieter Berkel via ryan)

6. SOLR-297: Fix bug in RequiredSolrParams where requiring a field
specific param would fail if a general default value had been supplied.
(hossman)

7. SOLR-331: Fix WordDelimiterFilter handling of offsets for synonyms or
other injected tokens that can break highlighting. (yonik)

8. SOLR-282: Snapshooter does not work on Solaris and OS X since the cp command
there does not have the -l option. Also updated commit/optimize related
scripts to handle both old and new response format. (bill)

9. SOLR-294: Logging of elapsed time broken on Solaris because the date command
there does not support the %s output format. (bill)

10. SOLR-136: Snappuller - "date -d" and locales don't mix. (Jürgen Hermann via bill)

11. SOLR-333: Changed distributiondump.jsp to use Solr HOME instead of CWD to set path.

12. SOLR-393: Removed duplicate contentType from raw-schema.jsp. (bill)

13. SOLR-413: Requesting a large number of documents to be returned (limit)
can result in an out-of-memory exception, even for a small index. (yonik)

14. The CSV loader incorrectly threw an exception when given
header=true (the default). (ryan, yonik)

15. SOLR-449: the python and ruby response writers are now able to correctly
output NaN and Infinity in their respective languages. (klaas)

16. SOLR-42: HTMLStripReader tokenizers now preserve correct source
offsets for highlighting. (Grant Ingersoll via yonik)

17. SOLR-481: Handle UnknownHostException in _info.jsp (gsingers)

18. SOLR-324: Add proper support for Long and Doubles in sorting, etc. (gsingers)

19. SOLR-496: Cache-Control max-age changed to Long so Expires
calculation won't cause overflow. (Thomas Peuss via hossman)

20. SOLR-535: Fixed typo (Tokenzied -> Tokenized) in schema.jsp (Thomas Peuss via billa)

21. SOLR-529: Better error messages from SolrQueryParser when field isn't
specified and there is no defaultSearchField in schema.xml
(Lars Kotthoff via hossman)

22. SOLR-530: Better error messages/warnings when parsing schema.xml:
field using bogus fieldtype and multiple copyFields to a non-multiValue
field. (Shalin Shekhar Mangar via hossman)

23. SOLR-528: Better error message when defaultSearchField is bogus or not
indexed. (Lars Kotthoff via hossman)

24. SOLR-533: Fixed tests so they don't use hardcoded port numbers.
(hossman)

25. SOLR-400: SolrExceptionTest should now handle using OpenDNS as a DNS provider (gsingers)

26. SOLR-541: Legacy XML update support (provided by SolrUpdateServlet
when no RequestHandler is mapped to "/update") now logs error correctly.
(hossman)

27. SOLR-267: Changed logging to report number of hits, and also provide a mechanism to add log
messages to be output by the SolrCore via a NamedList toLog member variable.
(Will Johnson, yseeley, gsingers)

- SOLR-267: Removed adding values to the HTTP headers in SolrDispatchFilter (gsingers)

28. SOLR-509: Moved firstSearcher event notification to the end of the SolrCore constructor
(Koji Sekiguchi via gsingers)

29. SOLR-470, SOLR-552, SOLR-544, SOLR-701: Multiple fixes to DateField
regarding lenient parsing of optional milliseconds, and correct
formatting using the canonical representation. LegacyDateField has
been added for people who have come to depend on the existing
broken behavior. (hossman, Stefan Oestreicher)

30. SOLR-539: Fix for non-atomic long counters and a cast fix to avoid divide
by zero. (Sean Timm via Otis Gospodnetic)

31. SOLR-514: Added explicit media-type with UTF* charset to *.xsl files that
don't already have one. (hossman)

32. SOLR-505: Give RequestHandlers the possibility to suppress the generation
of HTTP caching headers. (Thomas Peuss via Otis Gospodnetic)

33. SOLR-553: Handle highlighting of phrase terms better when
hl.usePhraseHighlighter=true URL param is used.
(Bojan Smid via Otis Gospodnetic)

34. SOLR-590: Limitation in pgrep on Linux platform breaks script-utils fixUser.
(Hannes Schmidt via billa)

35. SOLR-597: SolrServlet no longer "caches" SolrCore. This was causing
problems in Resin, and could potentially cause problems for customized
usages of SolrServlet.

36. SOLR-585: Now sets the QParser on the ResponseBuilder (gsingers)

37. SOLR-604: If the spellchecking path is relative, make it relative to the Solr Data Directory.
(Shalin Shekhar Mangar via gsingers)

38. SOLR-584: Make stats.jsp and stats.xsl more robust.
(Yousef Ourabi and hossman)

39. SOLR-443: SolrJ: Declare UTF-8 charset on POSTed parameters
to avoid problems with servlet containers that default to latin-1
and allow switching of the exact POST mechanism for parameters
via useMultiPartPost in CommonsHttpSolrServer.
(Lars Kotthoff, Andrew Schurman, ryan, yonik)

40. SOLR-556: multi-valued fields always highlighted in disparate snippets
(Lars Kotthoff via klaas)

41. SOLR-501: Fix admin/analysis.jsp UTF-8 input for some other servlet
containers such as Tomcat. (Hiroaki Kawai, Lars Kotthoff via yonik)

42. SOLR-616: SpellChecker accuracy configuration is not applied for FileBasedSpellChecker.
Apply it for FileBasedSpellChecker and IndexBasedSpellChecker both.
(shalin)

43. SOLR-648: SpellCheckComponent throws NullPointerException on using spellcheck.q request
parameter after restarting Solr, if reload is called but build is not called.
(Jonathan Lee, shalin)

44. SOLR-598: DebugComponent now always occurs last in the SearchHandler list unless the
components are explicitly declared. (gsingers)

45. SOLR-676: DataImportHandler should use UpdateRequestProcessor API instead of directly
using UpdateHandler. (shalin)

46. SOLR-696: Fixed bug in NamedListCodec in regards to serializing Iterable objects. (gsingers)

47. SOLR-669: snappuller fix for FreeBSD/Darwin (Richard "Trey" Hyde via Otis Gospodnetic)

48. SOLR-606: Fixed spell check collation offset issue. (Stefan Oestreicher, Geoffrey Young, gsingers)

49. SOLR-589: Improved handling of badly formatted query strings (Sean Timm via Otis Gospodnetic)

50. SOLR-749: Allow QParser and ValueSourceParsers to be extended with same name (hossman, gsingers)

51. SOLR-704: DIH NumberFormatTransformer can silently ignore part of the
string while parsing. Now it tries to use the complete string for parsing.
Failure to do so will result in an exception.
(Stefan Oestreicher via shalin)

52. SOLR-729: DIH Context.getDataSource(String) gives current entity's
DataSource instance regardless of argument. (Noble Paul, shalin)

53. SOLR-726: DIH: Jdbc Drivers and DataSources fail to load if placed in
multicore sharedLib or core's lib directory.
(Walter Ferrara, Noble Paul, shalin)

Other Changes
1. SOLR-135: Moved common classes to org.apache.solr.common and altered the
build scripts to make two jars: apache-solr-1.3.jar and
apache-solr-1.3-common.jar. This common.jar can be used in client code;
it does not have lucene or junit dependencies. The original classes
have been replaced with a @Deprecated extended class and are scheduled
to be removed in a later release. While this change does not affect API
compatibility, it is recommended to update references to these
deprecated classes. (ryan)

2. SOLR-268: Tweaks to post.jar so it prints the error message from Solr.
(Brian Whitman via hossman)

3. Upgraded to Lucene 2.2.0; June 18, 2007.

4. SOLR-215: Static access to SolrCore.getSolrCore() and SolrConfig.config
have been deprecated in order to support multiple loaded cores.
(Henri Biestro via ryan)

5. SOLR-367: The create method in all TokenFilter and Tokenizer Factories
provided by Solr now declare their specific return types instead of just
using "TokenStream" (hossman)

6. SOLR-396: Hooks added to the build system for automatic generation of (stub)
Tokenizer and TokenFilter Factories.
Also: new Factories for all Tokenizers and TokenFilters provided by the
lucene-analyzers-2.2.0.jar -- includes support for German, Chinese,
Russian, Dutch, Greek, Brazilian, Thai, and French. (hossman)

7. Upgraded to commons-CSV r609327, which fixes escaping bugs and
introduces new escaping and whitespace handling options to
increase compatibility with different formats. (yonik)

8. Upgraded to Lucene 2.3.0; Jan 23, 2008.

9. SOLR-451: Changed analysis.jsp to use POST instead of GET, also made the input area a
bit bigger (gsingers)

10. Upgrade to Lucene 2.3.1

11. SOLR-531: Different exit code for rsyncd-start and snappuller if disabled (Thomas Peuss via billa)

12. SOLR-550: Clarified DocumentBuilder addField javadocs (gsingers)

13. Upgrade to Lucene 2.3.2

14. SOLR-518: Changed luke.xsl to use divs w/css for generating histograms
instead of SVG (Thomas Peuss via hossman)

15. SOLR-592: Added ShardParams interface and changed several string literals
to references to constants in CommonParams.
(Lars Kotthoff via Otis Gospodnetic)

16. SOLR-520: Deprecated unused LengthFilter since already core in
Lucene-Java (hossman)

17. SOLR-645: Refactored SimpleFacetsTest (Lars Kotthoff via hossman)

18. SOLR-591: Changed Solrj default value for facet.sort to true (Lars Kotthoff via Shalin)

19. Upgraded to Lucene 2.4-dev (r669476) to support SOLR-572 (gsingers)

20. SOLR-636: Improve/simplify example configs; and make index.jsp
links more resilient to configs loaded via an InputStream
(Lars Kotthoff, hossman)

21. SOLR-682: Scripts now support FreeBSD (Richard Trey Hyde via gsingers)

22. SOLR-489: Added in deprecation comments. (Sean Timm, Lars Kotthoff via gsingers)

23. SOLR-692: Migrated to stable released builds of StAX API 1.0.1 and StAX 1.2.0 (shalin)
24. Upgraded to Lucene 2.4-dev (r686801) (yonik)
25. Upgraded to Lucene 2.4-dev (r688745) 27-Aug-2008 (yonik)
26. Upgraded to Lucene 2.4-dev (r691741) 03-Sep-2008 (yonik)
27. Replaced the StAX reference implementation with the geronimo
StAX API jar, and the Woodstox StAX implementation. (yonik)

Build
1. SOLR-411. Changed the names of the Solr JARs to use the de facto standard JAR names based on
project-name-version.jar. This yields, for example:
apache-solr-common-1.3-dev.jar
apache-solr-solrj-1.3-dev.jar
apache-solr-1.3-dev.jar

2. SOLR-479: Added clover code coverage targets for committers and the nightly build. Requires
the Clover library, as licensed to Apache and only available privately. To run:
ant -Drun.clover=true clean clover test generate-clover-reports

3. SOLR-510: Nightly release includes client sources. (koji)

4. SOLR-563: Modified the build process to build contrib projects
(Shalin Shekhar Mangar via Otis Gospodnetic)

5. SOLR-673: Modify build file to create javadocs for core, solrj, contrib and "all inclusive" (shalin)

6. SOLR-672: Nightly release includes contrib sources. (Jeremy Hinegardner, shalin)

7. SOLR-586: Added ant target and POM files for building maven artifacts of the Solr core, common,
client and contrib. The target can publish artifacts with source and javadocs.
(Spencer Crissman, Craig McClanahan, shalin)

================== Release 1.2 ==================

Upgrading from Solr 1.1
-------------------------------------
IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
should be upgraded before the master! If the master were to be updated
first, the older searchers would not be able to read the new index format.

Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files should be needed.

This version of Solr contains a new version of Lucene implementing
an updated index format. This version of Solr/Lucene can still read
and update indexes in the older formats, and will convert them to the new
format on the first index change. One change in the new index format
is that all "norms" are kept in a single file, greatly reducing the number
of files per segment. Users of compound file indexes will want to consider
converting to the non-compound format for faster indexing and slightly better
search concurrency.

The JSON response format for facets has changed to make it easier for
clients to retain sorted order. Use json.nl=map explicitly in clients
to get the old behavior, or add it as a default to the request handler
in solrconfig.xml.
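
To make the JSON facet change above more concrete, here is a small client-side
sketch. It assumes the new default output is a flat list that alternates names
and counts, which is one way to keep sorted order in JSON; the sample facet
values are invented, and the helper name is hypothetical.

```python
def pairs_from_flat(flat):
    """Turn [name1, count1, name2, count2, ...] into ordered (name, count) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")
    return list(zip(flat[0::2], flat[1::2]))

# Hypothetical facet counts, already sorted by count on the server side.
flat = ["electronics", 14, "memory", 3, "cables", 1]
pairs = pairs_from_flat(flat)

# A plain mapping, comparable to what json.nl=map would carry. Not every
# JSON client preserves the key order of such an object.
as_map = dict(pairs)
print(pairs)
```

Clients that only need lookups by name can keep using json.nl=map; clients
that display facets in server order are the ones the list form is meant for.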

The Lucene based Solr query syntax is slightly more strict.
A ':' in a field value must be escaped or the whole value must be quoted.
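
The two options just mentioned, escaping the ':' or quoting the whole value,
can be sketched as follows. The "url" field name and the sample value are
hypothetical, and the helpers deal only with the characters they name; other
special characters of the query syntax are out of scope here.

```python
def escape_colons(value):
    """Backslash-escape every ':' in a field value."""
    return value.replace(":", "\\:")

def quote_value(value):
    """Wrap the whole value in double quotes, escaping embedded quotes and backslashes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

value = "http://example.com/doc"   # hypothetical field value containing ':'
q1 = "url:" + escape_colons(value)
q2 = "url:" + quote_value(value)
print(q1)
print(q2)
```

Quoting turns the value into a phrase, so on analyzed text fields the two
forms need not match the same documents.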

The Solr "Request Handler" framework has been updated in two key ways:
First, if a Request Handler is registered in solrconfig.xml with a name
starting with "/" then it can be accessed using path-based URL, instead of
using the legacy "/select?qt=name" URL structure. Second, the Request
Handler framework has been extended making it possible to write Request
Handlers that process streams of data for doing updates, and there is a
new-style Request Handler for XML updates given the name of "/update" in
the example solrconfig.xml. Existing installations without this "/update"
handler will continue to use the old update servlet and should see no
changes in behavior. For new-style update handlers, errors are now
reflected in the HTTP status code, Content-type checking is more strict,
and the response format has changed and is controllable via the wt
parameter.
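
The difference between the two URL styles described above can be shown with a
short sketch. The base URL and the handler names "myhandler" and "/myhandler"
are assumptions chosen for the example; only the "/select?qt=name" form and
the leading "/" rule come from the text.

```python
from urllib.parse import urlencode

def legacy_url(base, handler_name, params):
    """Legacy style: the handler is chosen with the qt parameter on /select."""
    return base + "/select?" + urlencode([("qt", handler_name)] + list(params))

def path_url(base, handler_name, params):
    """Path style: only for handlers registered with a name starting with '/'."""
    if not handler_name.startswith("/"):
        raise ValueError("path-based access needs a handler name starting with '/'")
    return base + handler_name + "?" + urlencode(list(params))

base = "http://localhost:8983/solr"   # assumed base URL
params = [("q", "solr")]
print(legacy_url(base, "myhandler", params))
print(path_url(base, "/myhandler", params))
```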

Detailed Change List
--------------------

New Features
1. SOLR-82: Default field values can be specified in the schema.xml.
(Ryan McKinley via hossman)

2. SOLR-89: Two new TokenFilters with corresponding Factories...
* TrimFilter - Trims leading and trailing whitespace from Tokens
* PatternReplaceFilter - applies a Pattern to each token in the
stream, replacing match occurrences with a specified replacement.
(hossman)

3. SOLR-91: allow configuration of a limit of the number of searchers
that can be warming in the background. This can be used to avoid
out-of-memory errors, or contention caused by more and more searchers
warming in the background. An error is thrown if the limit specified
by maxWarmingSearchers in solrconfig.xml is exceeded. (yonik)

4. SOLR-106: New faceting parameters that allow specification of a
minimum count for returned facets (facet.mincount), paging through facets
(facet.offset, facet.limit), and explicit sorting (facet.sort).
facet.zeros is now deprecated. (yonik)

5. SOLR-80: Negative queries are now allowed everywhere. Negative queries
are generated and cached as their positive counterpart, speeding
generation and generally resulting in smaller sets to cache.
Set intersections in SolrIndexSearcher are more efficient,
starting with the smallest positive set, subtracting all negative
sets, then intersecting with all other positive sets. (yonik)
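
The evaluation order described in entry 5 (SOLR-80) can be illustrated with
plain Python sets standing in for document-id sets. This is a sketch of the
stated order only, not the SolrIndexSearcher code; the function name and the
sample sets are invented.

```python
def intersect_filters(positive_sets, negative_sets):
    """Combine doc-id sets: smallest positive first, then negatives, then the rest."""
    if not positive_sets:
        raise ValueError("need at least one positive set")
    ordered = sorted(positive_sets, key=len)
    result = set(ordered[0])       # start from the smallest positive set
    for neg in negative_sets:
        result -= neg              # subtract every negative set
    for pos in ordered[1:]:
        result &= pos              # intersect with the remaining positive sets
    return result

in_stock = {1, 2, 3, 4, 5, 6, 7, 8}
electronics = {2, 4, 6}
discontinued = {4}
print(intersect_filters([in_stock, electronics], [discontinued]))
```

Starting from the smallest set keeps every intermediate result no larger than
that set, so the later subtractions and intersections touch fewer elements.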

6. SOLR-117: Limit a field faceting to constraints with a prefix specified
by facet.prefix or f.<field>.facet.prefix. (yonik)

7. SOLR-107: JAVA API: Change NamedList to use Java5 generics
and implement Iterable<Map.Entry> (Ryan McKinley via yonik)

8. SOLR-104: Support for "Update Plugins" -- RequestHandlers that want
access to streams of data for doing updates. ContentStreams can come
from the raw POST body, multi-part form data, or remote URLs.
Included in this change is a new SolrDispatchFilter that allows
RequestHandlers registered with names that begin with a "/" to be
accessed using a URL structure based on that name.
(Ryan McKinley via hossman)

9. SOLR-126: DirectUpdateHandler2 supports autocommitting after a specified time
(in ms), using <autoCommit><maxTime>10000</maxTime></autoCommit>.
(Ryan McKinley via klaas).

10. SOLR-116: IndexInfoRequestHandler added. (Erik Hatcher)

11. SOLR-79: Add system property ${<sys.prop>[:<default>]} substitution for
configuration files loaded, including schema.xml and solrconfig.xml.
(Erik Hatcher with inspiration from Andrew Saar)
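
The ${<sys.prop>[:<default>]} syntax from entry 11 (SOLR-79) can be modelled
with a few lines of Python. This is a sketch of the substitution rule as
written in the entry, not Solr's implementation. The property name
solr.data.dir, the <dataDir> line and the decision to raise an error when
neither a value nor a default exists are assumptions of the example.

```python
import re

# ${name} or ${name:default}; the name may not contain '}' or ':'.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")

def substitute(text, properties):
    """Replace ${name} and ${name:default} using the given property mapping."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError("no value and no default for property " + name)
    return _PLACEHOLDER.sub(repl, text)

line = "<dataDir>${solr.data.dir:./solr/data}</dataDir>"
print(substitute(line, {}))
print(substitute(line, {"solr.data.dir": "/var/solr/data"}))
```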
|
||
|
||
12. SOLR-149: Changes to make Solr more easily embeddable, in addition
|
||
to logging which request handler handled each request.
|
||
(Ryan McKinley via yonik)
|
||
|
||
13. SOLR-86: Added standalone Java-based command-line updater.
|
||
(Erik Hatcher via Bertrand Delecretaz)
|
||
|
||
14. SOLR-152: DisMaxRequestHandler now supports configurable alternate
|
||
behavior when q is not specified. A "q.alt" param can be specified
|
||
using SolrQueryParser syntax as a mechanism for specifying what query
|
||
the dismax handler should execute if the main user query (q) is blank.
|
||
(Ryan McKinley via hossman)
|
||
|
||
15. SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler
|
||
allows for specifying the amount of default slop to use when parsing
|
||
explicit phrase queries from the user.
|
||
(Adam Hiatt via hossman)
|
||
|
||
16. SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from
|
||
the Lucene contrib.
|
||
(Otis Gospodnetic and Adam Hiatt)
|
||
|
||
17. SOLR-182: allow lazy loading of request handlers on first request.
|
||
(Ryan McKinley via yonik)
|
||
|
||
18. SOLR-81: More SpellCheckerRequestHandler enhancements, inlcluding
|
||
support for relative or absolute directory path configurations, as
|
||
well as RAM based directory. (hossman)
|
||
|
||
19. SOLR-197: New parameters for input: stream.contentType for specifying
or overriding the content type of input, and stream.file for reading
local files. (Ryan McKinley via yonik)

20. SOLR-66: CSV data format for document additions and updates. (yonik)

21. SOLR-184: add echoHandler=true to responseHeader, support echoParams=all
(Ryan McKinley via ehatcher)

22. SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens
from the input string using a regex Pattern. (Ryan McKinley)

23. SOLR-162: Added a "Luke" request handler and other admin helpers.
This exposes the system status through the standard requestHandler
framework. (ryan)

24. SOLR-212: Added a DirectSolrConnection class. This lets you access
Solr using the standard request/response formats, but does not require
an HTTP connection. It is designed for embedded applications. (ryan)

25. SOLR-204: The request dispatcher (added in SOLR-104) can handle
calls to /select. This offers uniform error handling for /update and
/select. To enable this behavior, you must add:
<requestDispatcher handleSelect="true" > to your solrconfig.xml.
See the example solrconfig.xml for details. (ryan)

26. SOLR-170: StandardRequestHandler now supports a "sort" parameter.
Using the ';' syntax is still supported, but it is recommended to
transition to the new syntax. (ryan)

27. SOLR-181: The index schema now supports "required" fields. Attempts
to add a document without a required field will fail, returning a
descriptive error message. By default, the uniqueKey field is
a required field. This can be disabled by setting required=false
in schema.xml. (Greg Ludington via ryan)

28. SOLR-217: Fields configured in the schema to be neither indexed nor
stored will now be quietly ignored by Solr when Documents are added.
The example schema has a comment explaining how this can be used to
ignore any "unknown" fields.
(Will Johnson via hossman)

29. SOLR-227: If schema.xml defines multiple fieldTypes, fields, or
dynamicFields with the same name, a severe error will be logged rather
than quietly continuing. Depending on the <abortOnConfigurationError>
settings, this may halt the server. Likewise, if solrconfig.xml
defines multiple RequestHandlers with the same name it will also add
an error. (ryan)

30. SOLR-226: Added support for dynamic field as the destination of a
copyField using glob (*) replacement. (ryan)

31. SOLR-224: Adding a PhoneticFilterFactory that uses Apache Commons Codec
language encoders to build phonetically similar tokens. This currently
supports: DoubleMetaphone, Metaphone, Soundex, and RefinedSoundex (ryan)

32. SOLR-199: new n-gram tokenizers available via NGramTokenizerFactory
and EdgeNGramTokenizerFactory. (Adam Hiatt via yonik)

33. SOLR-234: TrimFilter can update the Token's startOffset and endOffset
if updateOffsets="true". By default the Token offsets are unchanged.
(ryan)

34. SOLR-208: new example_rss.xsl and example_atom.xsl to provide more
examples for people about the Solr XML response format and how they
can transform it to suit different needs.
(Brian Whitman via hossman)

35. SOLR-249: Deprecated SolrException( int, ... ) constructors in favor
of constructors that take an ErrorCode enum. This will ensure that
all SolrExceptions use a valid HTTP status code. (ryan)

36. SOLR-386: Abstracted SolrHighlighter and moved existing implementation
to DefaultSolrHighlighter. Adjusted SolrCore and solrconfig.xml so
that highlighter is configurable via a class attribute. Allows users
to use their own highlighter implementation. (Tricia Williams via klaas)

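As an illustration of the ${<sys.prop>[:<default>]} syntax described in
SOLR-79 (entry 11 above), the following Java sketch resolves placeholders
against a map of properties and falls back to the inline default. It is not
Solr's implementation: the class and method names are hypothetical, and
throwing an exception for an undefined property without a default is an
assumption made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr.
final class PropertySubstitutionSketch {
    // Matches ${name} or ${name:default}; the name may not contain ':' or '}'.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    static String substitute(String text, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            if (value == null) {
                value = m.group(2); // inline default, or null if none was given
            }
            if (value == null) {
                // Assumed behavior: fail loudly when nothing can be substituted.
                throw new IllegalArgumentException(
                    "No value or default for property: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With an empty property map, a string such as
<dataDir>${solr.data.dir:./solr/data}</dataDir> (solr.data.dir is a made-up
property name) would resolve to <dataDir>./solr/data</dataDir>.
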
Changes in runtime behavior
1. Highlighting using DisMax will only pick up terms from the main
user query, not boost or filter queries (klaas).

2. SOLR-125: Change default of json.nl to flat, change so that
json.nl only affects items where order matters (facet constraint
listings). Fix JSON output bug for null values. Internal JAVA API:
change most uses of NamedList to SimpleOrderedMap. (yonik)

3. A new method "getSolrQueryParser" has been added to the IndexSchema
class for retrieving a new SolrQueryParser instance with all options
specified in the schema.xml's <solrQueryParser> block set. The
documentation for the SolrQueryParser constructor and its use of
IndexSchema has also been clarified.
(Erik Hatcher and hossman)

4. DisMaxRequestHandler's bq, bf, qf, and pf parameters can now accept
multiple values (klaas).

5. Queries are rewritten before highlighting is performed. This enables
proper highlighting of prefix and wildcard queries (klaas).

6. A meaningful exception is raised when attempting to add a doc missing
a unique id if it is declared in the schema and allowDups=false.
(ryan via klaas)

7. SOLR-183: Exceptions with error code 400 are raised when
numeric argument parsing fails. RequiredSolrParams class added
to facilitate checking for parameters that must be present.
(Ryan McKinley, J.J. Larrea via yonik)

8. SOLR-179: By default, Solr will abort after any severe initialization
errors. This behavior can be disabled by setting:
<abortOnConfigurationError>false</abortOnConfigurationError>
in solrconfig.xml (ryan)

9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
the new request dispatcher (SOLR-104). This requires posted content to
have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'
The response format matches that of /select and returns standard error
codes. To enable Solr 1.1 style /update, do not map "/update" to any
handler in solrconfig.xml (ryan)

10. SOLR-231: If a charset is not specified in the contentType,
ContentStream.getReader() will use UTF-8 encoding. (ryan)

11. SOLR-230: More options for post.jar to support stdin, xml on the
commandline, and deferring commits. Tutorial modified to take
advantage of these options so there is no need for curl.
(hossman)

12. SOLR-128: Upgraded Jetty to the latest stable release 6.1.3 (ryan)

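The charset rule in SOLR-231 (entry 10 above) can be sketched in a few lines
of Java: use the charset named in the content type if there is one, otherwise
UTF-8. This is an illustrative helper with hypothetical names, not the code
behind ContentStream.getReader(), and it assumes a non-null content type.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Solr.
final class ContentTypeCharsetSketch {
    /** Returns the charset named in a non-null Content-Type value, or UTF-8 if none is given. */
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                // Throws an unchecked exception for unknown or malformed names.
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8; // no charset parameter present
    }
}
```

For the header value shown in entry 9, 'text/xml; charset=utf-8', this returns
UTF-8; for a bare 'text/xml' it also returns UTF-8, via the fallback.
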
Optimizations
1. SOLR-114: HashDocSet specific implementations of union() and andNot()
for a 20x performance improvement for those set operations, and a new
hash algorithm speeds up exists() by 10% and intersectionSize() by 8%.
(yonik)

2. SOLR-115: Solr now uses BooleanQuery.clauses() instead of
BooleanQuery.getClauses() in any situation where there is no risk of
modifying the original query.
(hossman)

3. SOLR-221: Speed up sorted faceting on multivalued fields by ~60%
when the base set consists of a relatively large portion of the
index. (yonik)

4. SOLR-221: Added a facet.enum.cache.minDf parameter which avoids
using the filterCache for terms that match few documents, trading
decreased memory usage for increased query time. (yonik)

Bug Fixes
1. SOLR-87: Parsing of synonym files did not correctly handle escaped
whitespace such as \r\n\t\b\f. (yonik)

2. SOLR-92: DOMUtils.getText (used when parsing config files) did not
work properly with many DOM implementations when dealing with
"Attributes". (Ryan McKinley via hossman)

3. SOLR-9,SOLR-99: Tighten up sort specification error checking, throw
exceptions for missing sort specifications or a sort on a non-indexed
field. (Ryan McKinley via yonik)

4. SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions
were being ignored by all "out of the box" RequestHandlers. (hossman)

5. SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved
some JNDI related code to the init method of a Servlet Filter -
according to the Servlet Spec, all Filters should be initialized
prior to initializing any Servlets, but this is not the case in at
least one Servlet Container (Resin). This "bug fix" refactors
this JNDI code so that it should be executed the first time any
attempt is made to use the solr.home dir.
(Ryan McKinley via hossman)

6. SOLR-173: Bug fix to SolrDispatchFilter to reduce the "too many open
files" problem: SolrDispatchFilter was not closing requests
when finished. Also modified ResponseWriters to only fetch a Searcher
reference if necessary for writing out DocLists.
(Ryan McKinley via hossman)

7. SOLR-168: Fix display positioning of multiple tokens at the same
position in analysis.jsp (yonik)

8. SOLR-167: The SynonymFilter sometimes generated incorrect offsets when
multi token synonyms were matched in the source text. (yonik)

9. SOLR-188: bin scripts do not support non-default webapp names. Added "-U"
option to specify a full path to the update url, overriding the
"-h" (hostname), "-p" (port) and "-w" (webapp name) parameters.
(Jeff Rodenburg via billa)

10. SOLR-198: RunExecutableListener always waited for the process to
finish, even when wait="false" was set. (Koji Sekiguchi via yonik)

11. SOLR-207: Changed distribution scripts to remove recursive find
and avoid use of "find -maxdepth" on platforms where it is not
supported. (yonik)

12. SOLR-222: Changing writeLockTimeout in solrconfig.xml did not
change the effective timeout. (Koji Sekiguchi via yonik)

13. Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx cannot
access handlers that start with "/". This makes path based authentication
possible for path based request handlers. (ryan)

14. SOLR-214: Some servlet containers (including Tomcat and Resin) do not
obey the specified charset. Rather than letting the container handle
it, Solr now uses the charset from the header contentType to decode posted
content. Using the contentType: "text/xml; charset=utf-8" will force
utf-8 encoding. If you do not specify a contentType, it will use the
platform default. (Koji Sekiguchi via ryan)

15. SOLR-241: Undefined system properties used in configuration files now
cause a clear message to be logged rather than an obscure exception thrown.
(Koji Sekiguchi via ehatcher)

Other Changes
1. Updated to Lucene 2.1

2. Updated to Lucene 2007-05-20_00-04-53

================== Release 1.1.0 ==================

Status
------
This is the first release since Solr joined the Incubator, and brings many
new features and performance optimizations including highlighting,
faceted browsing, and JSON/Python/Ruby response formats.


Upgrading from previous Solr versions
-------------------------------------
Older Apache Solr installations can be upgraded by replacing
the relevant war file with the new version. No changes to configuration
files are needed and the index format has not changed.

The default version of the Solr XML response syntax has been changed to 2.2.
Behavior can be preserved for those clients not explicitly specifying a
version by adding a default to the request handler in solrconfig.xml.

By default, Solr will no longer use a searcher that has not fully warmed,
and requests will block in the meantime. To change back to the previous
behavior of using a cold searcher in the event there is no other
warm searcher, see the useColdSearcher config item in solrconfig.xml.

The XML response format when adding multiple documents to the collection
in a single <add> command has changed to return a single <result>.

Detailed Change List
--------------------

New Features
1. added support for setting Lucene's positionIncrementGap
2. Admin: new statistics for SolrIndexSearcher
3. Admin: caches now show config params on stats page
3. max() function added to FunctionQuery suite
4. postOptimize hook, mirroring the functionality of the postCommit hook,
but only called on an index optimize.
5. Ability to HTTP POST query requests to /select in addition to HTTP-GET
6. The default search field may now be overridden by requests to the
standard request handler using the df query parameter. (Erik Hatcher)
7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter)
8. Support for customizing the QueryResponseWriter per request
(Mike Baranczak / SOLR-16 / hossman)
9. Added KeywordTokenizerFactory (hossman)
10. copyField accepts dynamicfield-like names as the source.
(Darren Erik Vengroff via yonik, SOLR-21)
11. new DocSet.andNot(), DocSet.andNotSize() (yonik)
12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23)
13. New abstract BufferedTokenStream for people who want to write
Tokenizers or TokenFilters that require arbitrary buffering of the
stream. (SOLR-11 / yonik, hossman)
14. New RemoveDuplicatesTokenFilter - useful in situations where
synonyms, stemming, or word delimiting produce identical tokens at
the same position. (SOLR-11 / yonik, hossman)
15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler
and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik)
16. SnowballPorterFilterFactory language is configurable via the "language"
attribute, with the default being "English". (Bertrand Delacretaz via yonik, SOLR-27)
17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents.
(Bertrand Delacretaz via yonik, SOLR-28)
18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby"
(yonik, SOLR-31)
19. Make web admin pages return UTF-8, change Content-type declaration to include a
space between the mime-type and charset (Philip Jacob, SOLR-35)
20. Made query parser default operator configurable via schema.xml:
<solrQueryParser defaultOperator="AND|OR"/>
The default operator remains "OR".
21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes
flags (Greg Ludington via yonik, SOLR-39)
22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41)
23. Added a CompressableField base class which allows fields of derived types to
be compressed using the compress=true setting. The field type also gains the
ability to specify a size threshold at which field data is compressed.
(klaas, SOLR-45)
24. Simple faceted search support for fields (enumerating terms)
and arbitrary queries added to both StandardRequestHandler and
DisMaxRequestHandler. (hossman, SOLR-44)
25. In addition to specifying default RequestHandler params in the
solrconfig.xml, support has been added for configuring values to be
appended to the multi-val request params, as well as for configuring
invariant params that cannot be overridden in the query. (hossman, SOLR-46)
26. Default operator for query parsing can now be specified with q.op=AND|OR
from the client request, overriding the schema value. (ehatcher)
27. New XSLTResponseWriter does server side XSLT processing of XML Response.
In the process, an init(NamedList) method was added to QueryResponseWriter
which works the same way as SolrRequestHandler.
(Bertrand Delacretaz / SOLR-49 / hossman)
28. json.wrf parameter adds a wrapper-function around the JSON response,
useful in AJAX with dynamic script tags for specifying a JavaScript
callback function. (Bertrand Delacretaz via yonik, SOLR-56)
29. autoCommit can be specified every so many documents added (klaas, SOLR-65)
30. ${solr.home}/lib directory can now be used for specifying "plugin" jars
(hossman, SOLR-68)
31. Support for "Date Math" relative to "NOW" when specifying values of a
DateField in a query -- or when adding a document.
(hossman, SOLR-71)
32. useColdSearcher control in solrconfig.xml prevents the first searcher
from being used before it's done warming. This can help prevent
thrashing on startup when multiple requests hit a cold searcher.
The default is "false", preventing use before warm. (yonik, SOLR-77)

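Entry 10 above (SOLR-21) lets a copyField source be a dynamicField-like name.
The Java sketch below shows one way such matching could work, assuming a
pattern has at most one '*' and only at its start or end. The class, the
method names, and the rule map are hypothetical; this is not Solr's schema
code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr.
final class CopyFieldGlobSketch {
    /** True if name matches a pattern with an optional leading or trailing '*'. */
    static boolean matches(String pattern, String name) {
        if (pattern.startsWith("*")) {
            return name.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            return name.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(name); // no wildcard: exact match only
    }

    /** Collects the destination of every rule (source pattern -> dest) that matches fieldName. */
    static List<String> destinationsFor(Map<String, String> rules, String fieldName) {
        List<String> dests = new ArrayList<>();
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (matches(rule.getKey(), fieldName)) {
                dests.add(rule.getValue());
            }
        }
        return dests;
    }
}
```

Under these assumptions, a rule with source "*_t" and destination "text" would
copy a field named "title_t" but not one named "title_s".
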
Changes in runtime behavior
1. classes reorganized into different packages, package names changed to Apache
2. force read of document stored fields in QuerySenderListener
3. Solr now looks in ./solr/conf for config, ./solr/data for data;
configurable via the solr.solr.home system property
4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize
customization and per-field overrides on many options
(Andrew May via klaas, SOLR-37)
5. Default param values for DisMaxRequestHandler should now be specified
using a '<lst name="defaults">...</lst>' init param. For backwards
compatibility, all init params will be used as defaults if an init param
with that name does not exist. (hossman, SOLR-43)
6. The DisMaxRequestHandler now supports multiple occurrences of the "fq"
param. (hossman, SOLR-44)
7. FunctionQuery.explain now uses ComplexExplanation to provide more
accurate score explanations when composed in a BooleanQuery.
(hossman, SOLR-25)
8. Document update handling locking is much sparser, allowing performance gains
through multiple threads. Large commits also might be faster (klaas, SOLR-65)
9. Lazy field loading can be enabled via a solrconfig directive. This will be faster when
not all stored fields are needed from a document (klaas, SOLR-52)
10. Made admin JSPs return XML and transform them with new XSL stylesheets
(Otis Gospodnetic, SOLR-58)
11. If the "echoParams=explicit" request parameter is set, request parameters are copied
to the output. In XML output, they appear in a new <lst name="params"> list inside
the new <lst name="responseHeader"> element, which replaces the old <responseHeader>.
Adding a version=2.1 parameter to the request produces the old format, for backwards
compatibility (bdelacretaz and yonik, SOLR-59).

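The per-field overrides mentioned in entry 4 above (SOLR-37) can be pictured
as a two-step lookup: try a field-specific parameter first, then fall back to
the general one. The sketch below assumes the field-specific name has the
form f.<field>.<param> and uses hl.fragsize as an example parameter; neither
detail is stated in this entry, and the class is hypothetical rather than
Solr's own parameter handling.

```java
import java.util.Map;

// Hypothetical helper, not part of Solr.
final class FieldParamSketch {
    /**
     * Returns the value of f.<field>.<name> if present, otherwise the value of
     * <name>, otherwise null. The "f." prefix convention is an assumption.
     */
    static String fieldParam(Map<String, String> params, String field, String name) {
        String perField = params.get("f." + field + "." + name);
        return perField != null ? perField : params.get(name);
    }
}
```

With hl.fragsize=100 and f.title.hl.fragsize=50 in the request, a lookup for
field "title" would yield "50" and a lookup for any other field "100".
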
Optimizations
1. getDocListAndSet can now generate both a DocList and a DocSet from a
single lucene query.
2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
set
3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
are between 3 and 4 times faster. (yonik, SOLR-15)
4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
queries where DocSets aren't cached (for example, if the number of terms in the field
is larger than the filter cache.) (yonik)
6. Optimized facet.field faceting by as much as 500 times when the field has
a single token per document (not multiValued & not tokenized) by using the
Lucene FieldCache entry for that field to tally term counts. The first request
utilizing the FieldCache will take longer than subsequent ones.

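The idea behind entry 6 above, tallying term counts from a per-document
lookup instead of intersecting a set per term, can be sketched as follows.
The array docOrd stands in for the FieldCache entry (one term ordinal per
document); the names and the array layout are assumptions for illustration,
not Lucene's or Solr's actual data structures.

```java
// Hypothetical helper, not part of Solr or Lucene.
final class OrdinalFacetSketch {
    /**
     * docOrd[doc] is the ordinal (0..termCount-1) of the single term held by
     * that document. Returns, for each ordinal, how many matching documents
     * carry that term. Runs in one pass over the matching documents.
     */
    static int[] countByTerm(int[] docOrd, int termCount, int[] matchingDocs) {
        int[] counts = new int[termCount];
        for (int doc : matchingDocs) {
            counts[docOrd[doc]]++;
        }
        return counts;
    }
}
```

The cost is proportional to the number of matching documents rather than the
number of distinct terms, which is consistent with the large speedup reported
for single-valued fields.
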
Bug Fixes
1. Fixed delete-by-id for field types whose indexed form is different
from the printable form (mainly sortable numeric types).
2. Added escaping of attribute values in the XML response (Erik Hatcher)
3. Added empty extractTerms() to FunctionQuery to enable use in
a MultiSearcher (Yonik)
4. WordDelimiterFilter sometimes lost token positionIncrement information
5. Fix reverse sorting for fields where sortMissingFirst=true
(Rob Staveley, yonik)
6. Worked around a Jetty bug that caused invalid XML responses for fields
containing non ASCII chars. (Bertrand Delacretaz via yonik, SOLR-32)
7. WordDelimiterFilter can throw exceptions if configured with both
generate and catenate off. (Mike Klaas via yonik, SOLR-34)
8. Escape '>' in XML output (because ]]> is illegal in CharData)
9. field boosts weren't being applied and doc boosts were being applied to fields (klaas)
10. Multiple-doc update generates well-formed xml (klaas, SOLR-65)
11. Better parsing of pingQuery from solrconfig.xml (hossman, SOLR-70)
12. Fixed bug with "Distribution" page introduced when Versions were
added to "Info" page (hossman)
13. Fixed HTML escaping issues with user input to analysis.jsp and action.jsp
(hossman, SOLR-74)

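Entry 8 above notes that '>' must be escaped because the sequence ]]> may not
appear in XML character data. A minimal Java sketch of such escaping follows;
it is a hypothetical helper for illustration, not the escaping code Solr
uses, and it covers only the three characters relevant to element content.

```java
// Hypothetical helper, not part of Solr.
final class XmlEscapeSketch {
    /** Escapes '<', '>' and '&' so the result is safe as XML element content. */
    static String escapeCharData(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break; // prevents a literal "]]>"
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Escaping every '>' is broader than strictly required, but it avoids having to
detect the ]]> sequence specifically.
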
Other Changes
1. Upgrade to Lucene 2.0 nightly build 2006-06-22, lucene SVN revision 416224,
http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224
2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6)
3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302
4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18)
5. Updated to Lucene 2.0 nightly build 2006-09-07, SVN revision 462111
6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48)
7. backslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63
8. check solr return code in admin scripts, SOLR-62
9. Updated to Lucene 2.0 nightly build 2006-11-15, SVN revision 475069
10. Removed src/apps containing the legacy "SolrTest" app (hossman, SOLR-3)
11. Simplified index.jsp and form.jsp, primarily by removing/hiding XML
specific params, and adding an option to pick the output type. (hossman)
12. Added new numeric build property "specversion" to allow clean
MANIFEST.MF files (hossman)
13. Added Solr/Lucene versions to "Info" page (hossman)
14. Explicitly set mime-type of .xsl files in web.xml to
application/xslt+xml (hossman)
15. Config parsing should now work using DOM Level 2 parsers -- Solr
previously relied on getTextContent which is a DOM Level 3 addition
(Alexander Saar via hossman, SOLR-78)

2006/01/17 Solr open sourced, moves to Apache Incubator