Apache Solr Release Notes

Introduction
------------
Apache Solr is an open source enterprise search server based on the Apache Lucene Java
search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search,
caching, replication, and a web administration interface. It runs in a Java
servlet container such as Tomcat.

See http://lucene.apache.org/solr for more information.


Getting Started
---------------
You need a Java 1.6 VM or later installed.
In this release, there is an example Solr server including a bundled
servlet container in the directory named "example".
See the tutorial at http://lucene.apache.org/solr/tutorial.html


$Id$

================== 5.0.0 ==================

(No changes)

================== 4.1.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.2
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.4.5

Upgrading from Solr 4.0.0-BETA
----------------------

Custom Java parsing plugins need to migrate from throwing the internal
ParseException to throwing SyntaxError.


Detailed Change List
----------------------

New Features
----------------------

* SOLR-2255: Enhanced pivot faceting to use local-params in the same way that
regular field value faceting can. This means support for excluding a filter
query, using a different output key, and specifying 'threads' to do
facet.method=fcs concurrently. PivotFacetHelper now extends SimpleFacet and
the getFacetImplementation() extension hook was removed. (dsmiley)

* SOLR-3897: A highlighter parameter "hl.preserveMulti" to return all of the
values of a multiValued field in their original order when highlighting.
(Joel Bernstein via yonik)

* SOLR-3929: Support configuring IndexWriter max thread count in solrconfig.
(phunt via Mark Miller)

* SOLR-3906: Add support for AnalyzingSuggester (LUCENE-3842), where the
underlying analyzed form used for suggestions is separate from the returned
text. (Robert Muir)

* SOLR-3985: ExternalFileField caches can be reloaded on firstSearcher/
newSearcher events using the ExternalFileFieldReloader (Alan Woodward)

* SOLR-3911: Make Directory and DirectoryFactory first class so that the majority
of Solr's features work with any custom implementations. (Mark Miller)
Additional Work:
- SOLR-4032: Files larger than an internal buffer size fail to replicate.
(Mark Miller, Markus Jelsma)

* SOLR-1972: Add extra statistics to RequestHandlers - 5 & 15-minute reqs/sec
rolling averages; median, 75th, 95th, 99th, 99.9th percentile request times
(Alan Woodward, Shawn Heisey, Adrien Grand)

* SOLR-4051: Add <propertyWriter /> element to DIH's data-config.xml file,
allowing the user to specify the location, filename and Locale for
the "data-config.properties" file. Alternatively, users can specify their
own property writer implementation for greater control. This new configuration
element is optional, and defaults mimic prior behavior. The one exception is
that the "root" locale is default. Previously it was the machine's default locale.
(James Dyer)

* SOLR-4084: Add FuzzyLookupFactory, which is like AnalyzingSuggester except that
it can tolerate typos in the input. (Areek Zillur via Robert Muir)

* SOLR-4088: New and improved auto host detection strategy for SolrCloud.
(Raintung Li via Mark Miller)

* SOLR-3970: SystemInfoHandler now exposes more details about the
JRE/VM/Java version in use. (hossman)

* SOLR-4101: Add support for storing term offsets in the index via a
'storeOffsetsWithPositions' flag on field definitions in the schema.
(Tom Winch, Alan Woodward)

* SOLR-4093: Solr QParsers may now be directly invoked in the lucene
query syntax without the _query_ magic field hack.
Example: foo AND {!term f=myfield v=$qq}
(yonik)

* SOLR-4087: Add MAX_DOC_FREQ option to MoreLikeThis.
(Andrew Janowczyk via Mark Miller)


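As a sketch of the SOLR-4093 syntax listed above, the snippet below URL-encodes a
request whose main query invokes the term QParser inline and dereferences its
value from a separate parameter. The query string, the field name "myfield" and
the parameter name "qq" come from the example in that entry; the value
"solr rocks" and the helper name build_query_string are made up for
illustration, and no request is actually sent.

```python
from urllib.parse import urlencode, parse_qs


def build_query_string(main_query, **extra_params):
    """URL-encode a Solr main query plus any parameters it dereferences."""
    params = {"q": main_query}
    params.update(extra_params)
    return urlencode(params)


# Query taken from the SOLR-4093 entry; the qq value is illustrative only.
query_string = build_query_string(
    "foo AND {!term f=myfield v=$qq}",
    qq="solr rocks",
)
print(query_string)
```

Keeping the raw term in its own parameter means the braces and the value are
escaped once, by the URL encoder, instead of being quoted by hand inside the
query.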
Optimizations
----------------------

* SOLR-3788: Admin Cores UI should redirect to newly created core details
(steffkes)

* SOLR-3895: XML and XSLT UpdateRequestHandler should not try to resolve
external entities. This improves speed of loading e.g. XSL-transformed
XHTML documents. (Martin Herfurt, uschindler, hossman)

* SOLR-3614: Fix XML parsing in XPathEntityProcessor to correctly expand
named entities, but ignore external entities. (uschindler, hossman)

* SOLR-3734: Improve Schema-Browser Handling for CopyField using
dynamicFields (steffkes)

* SOLR-3941: The "commitOnLeader" part of distributed recovery can use
openSearcher=false. (Tomas Fernandez Lobbe via Mark Miller)

* SOLR-4063: Allow CoreContainer to load multiple SolrCores in parallel rather
than just serially. (Mark Miller)

Bug Fixes
----------------------

* SOLR-4007: Morfologik dictionaries not available in Solr field type
due to class loader lookup problems. (Lance Norskog, Dawid Weiss)

* SOLR-3560: Handle different types of Exception Messages for Logging UI
(steffkes)

* SOLR-3637: Commit Status at Core-Admin UI is always false (steffkes)

* SOLR-3917: Partial State on Schema-Browser UI is not defined for Dynamic
Fields & Types (steffkes)

* SOLR-3939: Consider a sync attempt from leader to replica that fails due
to 404 a success. (Mark Miller, Joel Bernstein)

* SOLR-3940: Rejoining the leader election incorrectly triggers the code path
for a fresh cluster start rather than fail over. (Mark Miller)

* SOLR-3961: Fixed error using LimitTokenCountFilterFactory
(Jack Krupansky, hossman)

* SOLR-3933: Distributed commits are not guaranteed to be ordered within a
request. (Mark Miller)

* SOLR-3939: An empty or just replicated index cannot become the leader of a
shard after a leader goes down. (Joel Bernstein, yonik, Mark Miller)

* SOLR-3971: A collection that is created with numShards=1 turns into a
numShards=2 collection after starting up a second core and not specifying
numShards. (Mark Miller)

* SOLR-3988: Fixed SolrTestCaseJ4.adoc(SolrInputDocument) to respect
field and document boosts (hossman)

* SOLR-3981: Fixed bug that resulted in document boosts being compounded in
<copyField/> destination fields. (hossman)

* SOLR-3920: Fix server list caching in CloudSolrServer when using more than one
collection list with the same instance. (Grzegorz Sobczyk, Mark Miller)

* SOLR-3938: prepareCommit command omits commitData causing a failure to trigger
replication to slaves. (yonik)

* SOLR-3992: QuerySenderListener doesn't populate document cache.
(Shotaro Kamio, yonik)

* SOLR-3995: Recovery may never finish on SolrCore shutdown if the last reference to
a SolrCore is closed by the recovery process. (Mark Miller)

* SOLR-3998: Atomic update on uniqueKey field itself causes duplicate document.
(Eric Spencer, yonik)

* SOLR-4001: In CachingDirectoryFactory#close, if there are still refs for a
Directory outstanding, we need to wait for them to be released before closing.
(Mark Miller)

* SOLR-4005: If CoreContainer fails to register a created core, it should close it.
(Mark Miller)

* SOLR-4009: OverseerCollectionProcessor is not resilient to many error conditions
and can stop running on errors. (Raintung Li, milesli, Mark Miller)

* SOLR-4019: Log stack traces for 503/Service Unavailable SolrException if not
thrown by PingRequestHandler. Do not log exceptions if a user tries to view a
hidden file using ShowFileRequestHandler. (Tomás Fernández Löbbe via James Dyer)

* SOLR-3589: Edismax parser does not honor mm parameter if analyzer splits a token.
(Tom Burton-West, Robert Muir)

* SOLR-4031: Upgrade to Jetty 8.1.7 to fix a bug where on very rare occasions
the content of two concurrent requests gets mixed up. (Per Steffensen, yonik)

* SOLR-4060: ReplicationHandler can try and do a snappull and open a new IndexWriter
after shutdown has already occurred, leaving an IndexWriter that is not closed.
(Mark Miller)

* SOLR-4055: Fix a thread safety issue with the Collections API that could
cause actions to be targeted at the wrong SolrCores.
(Raintung Li, Per Steffensen via Mark Miller)

* SOLR-3993: If multiple SolrCores for a shard coexist on a node, on cluster
restart, leader election would stall until timeout, waiting to see all of
the replicas come up. (Mark Miller, Alexey Kudinov)

* SOLR-2045: Databases that require a commit to be issued before closing the
connection on a non-read-only database leak connections. Also expanded the
SqlEntityProcessor test to sometimes use Derby as well as HSQLDB (Derby is
one db affected by this bug). (Fenlor Sebastia, James Dyer)

* SOLR-4064: When there is an unexpected exception while trying to run the new
leader process, the SolrCore will not correctly rejoin the election.
(Po Rui via Mark Miller)

* SOLR-3989: SolrZkClient constructor dropped exception cause when throwing
a new RuntimeException. (Colin Bartolome, yonik)

* SOLR-4036: field aliases in fl should not cause properties of target field
to be used. (Martin Koch, yonik)

* SOLR-4003: The SolrZKClient clean method should not try and clear zk paths
that start with /zookeeper, as this can fail and stop the removal of
further nodes. (Mark Miller)

* SOLR-4076: SolrQueryParser should run fuzzy terms through
MultiTermAwareComponents to ensure that (for example) a fuzzy query of
foobar~2 is equivalent to FooBar~2 on a field that includes lowercasing.
(yonik)

* SOLR-4081: QueryParsing.toString, used during debugQuery=true, did not
correctly handle ExtendedQueries such as WrappedQuery
(used when cache=false), spatial queries, and frange queries.
(Eirik Lygre, yonik)

* SOLR-3959: Ensure the internal comma separator of poly fields is escaped
for CSVResponseWriter. (Areek Zillur via Robert Muir)

* SOLR-4075: A logical shard that has had all of its SolrCores unloaded should
be removed from the cluster state. (Mark Miller, Gilles Comeau)

* SOLR-4034: Check if a collection already exists before trying to create a
new one. (Po Rui, Mark Miller)

* SOLR-4097: Race can cause NPE in logging line on first cluster state update.
(Mark Miller)

* SOLR-4099: Allow the collection api work queue to make forward progress even
when its watcher is not fired for some reason. (Raintung Li via Mark Miller)

* SOLR-3960: Fixed a bug where Distributed Grouping ignored PostFilters
(Nathan Visagan, hossman)

* SOLR-3842: DIH would not populate multivalued fields if the column name
derives from a resolved variable (James Dyer)

* SOLR-4117: Retrieving the size of the index may use the wrong index dir if
you are replicating.
(Mark Miller, Markus Jelsma)

* SOLR-2890: Fixed a bug that prevented omitNorms and omitTermFreqAndPositions
options from being respected in some <fieldType/> declarations (hossman)

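The SOLR-4076 entry above can be pictured with a small sketch. The function
below is hypothetical and is not Solr's MultiTermAwareComponents code: it
applies only the one normalization named in the entry (lowercasing) to the term
part of a fuzzy query and leaves the "~N" edit distance untouched, which is
enough to show why foobar~2 and FooBar~2 should end up as the same query.

```python
def normalize_fuzzy_term(fuzzy_term):
    """Lowercase the term part of 'term~N' while keeping the '~N' suffix."""
    # partition() splits at the first '~'; the suffix parts are empty strings
    # when the input carries no edit distance.
    term, tilde, distance = fuzzy_term.partition("~")
    return term.lower() + tilde + distance


print(normalize_fuzzy_term("FooBar~2"))
print(normalize_fuzzy_term("foobar~2"))
```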
Other Changes
----------------------

* SOLR-3899: SolrCore should not log at warning level when the index directory
changes - it's an info event. (Tobias Bergman, Mark Miller)

* SOLR-3861: Refactor SolrCoreState so that it's managed by SolrCore.
(Mark Miller, hossman)

* SOLR-3966: Eliminate superfluous warning from LanguageIdentifierUpdateProcessor
(Markus Jelsma via hossman)

* SOLR-3932: SolrCmdDistributorTest either takes 3 seconds or 3 minutes.
(yonik, Mark Miller)

* SOLR-3856: New tests for SqlEntityProcessor/CachedSqlEntityProcessor
(James Dyer)

* SOLR-4067: ZkStateReader#getLeaderProps should not return props for a leader
that it does not think is live. (Mark Miller)

* SOLR-4086: DIH refactor of VariableResolver and Evaluator. VariableResolver
and each built-in Evaluator are separate concrete classes. DateFormatEvaluator
now defaults with the ROOT Locale. However, users may specify a different
Locale using an optional new third parameter. (James Dyer)

* SOLR-3602: Update ZooKeeper to 3.4.5 (Mark Miller)

* SOLR-4095: DIH NumberFormatTransformer & DateFormatTransformer default to the
ROOT Locale if none is specified. These previously used the machine's default.
(James Dyer)

* SOLR-4096: DIH FileDataSource & FieldReaderDataSource default to UTF-8 encoding
if none is specified. These previously used the machine's default.
(James Dyer)

* SOLR-1916: DIH to not use Lucene-forbidden Java APIs
(default encoding, locale, etc.) (James Dyer, Robert Muir)

* SOLR-4111: SpellCheckCollatorTest#testContextSensitiveCollate to test against
both DirectSolrSpellChecker & IndexBasedSpellChecker
(Tomás Fernández Löbbe via James Dyer)

* SOLR-2141: Better test coverage for Evaluators (James Dyer)

* SOLR-4119: Update Guava to 13.0.1 (Mark Miller)

* SOLR-4074: Raise default ramBufferSizeMB to 100 from 32.
(yonik, Mark Miller)

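To illustrate why SOLR-4096 and SOLR-1916 above replace the machine's default
with a fixed encoding, the sketch below decodes the same bytes under two
encodings. It is written in Python rather than DIH's Java and contains no DIH
code; the sample text is a contributor name from the credits above, and Latin-1
stands in for "some other machine default" purely as an assumption.

```python
# Bytes as they would sit in a UTF-8 encoded data file.
raw = "Tomás Fernández Löbbe".encode("utf-8")

# Decoding with an explicit, fixed encoding gives the same text on every machine.
as_utf8 = raw.decode("utf-8")

# Decoding with a different default (Latin-1, assumed here) raises no error
# but silently produces garbled characters.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

The second decode succeeds without any warning, which is why a default that
depends on the machine is hard to notice until the data is already indexed.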
================== 4.0.0 ==================

Versions of Major Components
---------------------
Apache Tika 1.2
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.3.6

Upgrading from Solr 4.0.0-BETA
----------------------

In order to better support distributed search mode, the TermVectorComponent's
response format has been changed so that if the schema defines a
uniqueKeyField, then that field value is used as the "key" for each document in
its response section, instead of the internal lucene doc id. Users w/o a
uniqueKeyField will continue to see the same response format. See SOLR-3229
for more details.

If you are using SolrCloud's distributed update request capabilities and a non
string type id field, you must re-index.

Upgrading from Solr 4.0.0-ALPHA
----------------------

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

In addition, please review the notes above about upgrading from 4.0.0-BETA

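The uniqueKeyField requirement described above (SOLR-3682) can be checked
before upgrading with a small script. This is a minimal sketch under stated
assumptions: the embedded schema is a made-up sample, the function name is
hypothetical, and the check only looks for the explicit multiValued="false"
attribute recommended in the note. It does not reproduce Solr's own
validation, which may also take field type defaults into account.

```python
import xml.etree.ElementTree as ET

# Made-up sample schema, for illustration only.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" multiValued="false"/>
    <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
"""


def unique_key_is_explicitly_single_valued(schema_xml):
    """Return True if the uniqueKey field declares multiValued="false"."""
    root = ET.fromstring(schema_xml)
    key_name = (root.findtext("uniqueKey") or "").strip()
    for field in root.iter("field"):
        if field.get("name") == key_name:
            return field.get("multiValued") == "false"
    return False  # no uniqueKey, or no matching <field /> declaration


print(unique_key_is_explicitly_single_valued(SAMPLE_SCHEMA))
```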
Upgrading from Solr 3.6
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

* Setting abortOnConfigurationError=false is no longer supported
(since it has never worked properly). Solr will now warn you if
you attempt to set this configuration option at all. (see SOLR-1846)

* The default logic for the 'mm' param of the 'dismax' QParser has
been changed. If no 'mm' param is specified (either in the query,
or as a default in solrconfig.xml) then the effective value of the
'q.op' param (either in the query or as a default in solrconfig.xml
or from the 'defaultOperator' option in schema.xml) is used to
influence the behavior. If q.op is effectively "AND" then mm=100%.
If q.op is effectively "OR" then mm=0%. Users who wish to force the
legacy behavior should set a default value for the 'mm' param in
their solrconfig.xml file.

* The VelocityResponseWriter is no longer built into the core. Its JAR and
dependencies now need to be added (via <lib> or solr/home lib inclusion),
and it needs to be registered in solrconfig.xml like this:
<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>

* The update request parameter to choose Update Request Processor Chain is
renamed from "update.processor" to "update.chain". The old parameter was
deprecated but still working since Solr 3.2, but is now removed
entirely.

* The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
and replaced with the <indexConfig> section. There are also better defaults.
When migrating, if you don't know what your old settings mean, simply delete
both <indexDefaults> and <mainIndex> sections. If you have customizations,
put them in <indexConfig> section - with same syntax as before.

* Two of the SolrServer subclasses in SolrJ were renamed/replaced.
CommonsHttpSolrServer is now HttpSolrServer, and
StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.

* The PingRequestHandler no longer looks for a <healthcheck/> option in the
(legacy) <admin> section of solrconfig.xml. Users who wish to take
advantage of this feature should configure a "healthcheckFile" init param
directly on the PingRequestHandler. As part of this change, relative file
paths have been fixed to be resolved against the data dir. See the example
solrconfig.xml and SOLR-1258 for more details.

* Due to low level changes to support SolrCloud, the uniqueKey field can no
longer be populated via <copyField/> or <field default=...> in the
schema.xml. Users wishing to have Solr automatically generate a uniqueKey
value when adding documents should instead use an instance of
solr.UUIDUpdateProcessorFactory in their update processor chain. See
SOLR-2796 for more details.

In addition, please review the notes above about upgrading from 4.0.0-BETA and 4.0.0-ALPHA

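The new default for the dismax 'mm' param described in the list above can be
summarized as a small decision function. This is a sketch of the rule as stated
in the note, not Solr's parser code; the function name effective_mm is
hypothetical, and the caller is assumed to have already resolved q.op from the
query, solrconfig.xml or the schema's defaultOperator.

```python
def effective_mm(mm=None, q_op="OR"):
    """Return the minimum-should-match value implied by the upgrade note."""
    if mm is not None:
        return mm       # an explicitly specified mm is used as given
    if q_op.upper() == "AND":
        return "100%"   # q.op=AND: every clause must match
    return "0%"         # q.op=OR: no minimum is imposed


print(effective_mm(q_op="AND"))
print(effective_mm(q_op="OR"))
print(effective_mm(mm="2", q_op="AND"))
```

Setting a default 'mm' in solrconfig.xml corresponds to the first branch, which
is how the legacy behavior can be forced regardless of q.op.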
Detailed Change List
----------------------

New Features
----------------------

* SOLR-3670: New CountFieldValuesUpdateProcessorFactory makes it easy to index
the number of values in another field for later use at query time. (hossman)

* SOLR-2768: new "mod(x,y)" function for computing the modulus of two value
sources. (hossman)

* SOLR-3238: Numerous small improvements to the Admin UI (steffkes)

* SOLR-3597: seems like a lot of wasted whitespace at the top of the admin screens
(steffkes)

* SOLR-3304: Added Solr adapters for Lucene 4's new spatial module. With
SpatialRecursivePrefixTreeFieldType ("location_rpt" in example schema), it is
possible to index a variable number of points per document (and sort on them),
index not just points but any Spatial4j supported shape such as polygons, and
to query on these shapes too. Polygons require adding JTS to the classpath.
(David Smiley)

* SOLR-3825: Added optional capability to log what ids are in a response
(Scott Stults via gsingers)

* SOLR-3821: Added 'df' to the UI Query form (steffkes)

* SOLR-3822: Added hover titles to the edismax params on the UI Query form
(steffkes)

Optimizations
----------------------

* SOLR-3715: improve concurrency of the transaction log by removing
synchronization around log record serialization. (yonik)

* SOLR-3807: Currently during recovery we pause for a number of seconds after
waiting for the leader to see a recovering state so that any previous updates
will have finished before our commit on the leader - we don't need this wait
for peersync. (Mark Miller)

* SOLR-3837: When a leader is elected and asks replicas to sync back to him and
that fails, we should ask those nodes to recover asynchronously rather than
synchronously. (Mark Miller)

* SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer
on each request. (Mark Miller)


Bug Fixes
----------------------

* SOLR-3685: Solr Cloud sometimes skipped peersync attempt and replicated instead due
to tlog flags not being cleared when no updates were buffered during a previous
replication. (Markus Jelsma, Mark Miller, yonik)

* SOLR-3229: Fixed TermVectorComponent to work with distributed search
(Hang Xie, hossman)

* SOLR-3725: Fixed package-local-src-tgz target to not bring in unnecessary jars
and binary contents. (Michael Dodsworth via rmuir)

* SOLR-3649: Fixed bug in JavabinLoader that caused deleteById(List<String> ids)
to not work in SolrJ (siren)

* SOLR-3730: Rollback is not implemented quite right and can cause corner case fails in
SolrCloud tests. (rmuir, Mark Miller)

* SOLR-2981: Fixed StatsComponent to no longer return duplicated information
when requesting multiple stats.facet fields.
(Roman Kliewer via hossman)

* SOLR-3743: Fixed issues with atomic updates and optimistic concurrency in
conjunction with stored copyField targets by making real-time get never
return copyField targets. (yonik)

* SOLR-3746: Proper error reporting if updateLog is configured w/o necessary
"_version_" field in schema.xml (hossman)

* SOLR-3745: Proper error reporting if SolrCloud mode is used w/o
necessary "_version_" field in schema.xml (hossman)

* SOLR-3770: Overseer may lose updates to cluster state (siren)

* SOLR-3721: Fix bug that could theoretically allow multiple recoveries to run
briefly at the same time if the recovery thread join call was interrupted.
(Per Steffensen, Mark Miller)

* SOLR-3782: A leader going down while updates are coming in can cause shard
inconsistency. (Mark Miller)

* SOLR-3611: We do not show ZooKeeper data in the UI for a node that has children.
(Mark Miller)

* SOLR-3789: Fix bug in SnapPuller that caused "internal" compression to fail.
(siren)

* SOLR-3790: ConcurrentModificationException could be thrown when using hl.fl=*.
Fixed in r1231606. (yonik, koji)

* SOLR-3668: DataImport : Specifying Custom Parameters (steffkes)

* SOLR-3793: UnInvertedField faceting cached big terms in the filter
cache that ignored deletions, leading to duplicate documents in search
later when a filter of the same term was specified.
(Günter Hipler, hossman, yonik)

* SOLR-3679: Core Admin UI gives no feedback if "Add Core" fails (steffkes, hossman)

* SOLR-3795: Fixed LukeRequestHandler response to correctly return field name
strings in copyDests and copySources arrays (hossman)

* SOLR-3699: Fixed some Directory leaks when there were errors during SolrCore
or SolrIndexWriter initialization. (hossman)

* SOLR-3518: Include final 'hits' in log information when aggregating a
distributed request (Markus Jelsma via hossman)

* SOLR-3628: SolrInputField and SolrInputDocument are now consistently backed
by Collections passed in to setValue/setField, and defensively copy values
from Collections passed to addValue/addField
(Tom Switzer via hossman)

* SOLR-3595: CurrencyField now generates an appropriate error on schema init
if it is configured as multiValued - this has never been properly supported,
but previously failed silently in odd ways. (hossman)

* SOLR-3823: Fix 'bq' parsing in edismax. Please note that this required
reverting the negative boost support added by SOLR-3278 (hossman)

* SOLR-3827: Fix shareSchema=true in solr.xml
(Tomás Fernández Löbbe via hossman)

* SOLR-3809: Fixed config file replication when subdirectories are used
(Emmanuel Espina via hossman)

* SOLR-3828: Fixed QueryElevationComponent so that using 'markExcludes' does
not modify the result set or ranking of 'excluded' documents relative to
not using elevation at all. (Alexey Serba via hossman)

* SOLR-3569: Fixed debug output on distributed requests when there are no
results found. (David Bowen via hossman)

* SOLR-3811: Query Form using wrong values for dismax, edismax (steffkes)

* SOLR-3779: DataImportHandler's LineEntityProcessor when used in conjunction
with FileListEntityProcessor would only process the first file.
(Ahmet Arslan via James Dyer)

* SOLR-3791: CachedSqlEntityProcessor would throw a NullPointerException when
a query returns a row with a NULL key. (Steffen Moelter via James Dyer)

* SOLR-3833: When an election is started because a leader went down, the new
leader candidate should decline if the last state they published was not
active. (yonik, Mark Miller)

* SOLR-3836: When doing peer sync, we should only count sync attempts that
cannot reach the given host as success when the candidate leader is
syncing with the replicas - not when replicas are syncing to the leader.
(Mark Miller)

* SOLR-3835: In our leader election algorithm, if on connection loss we found
we did not create our election node, we should retry, not throw an exception.
(Mark Miller)

* SOLR-3834: A new leader on cluster startup should also run the leader sync
process in case there was a bad cluster shutdown. (Mark Miller)

* SOLR-3772: On cluster startup, we should wait until we see all registered
replicas before running the leader process - or if they all do not come up,
N amount of time. (Mark Miller)

* SOLR-3756: If we are elected the leader of a shard, but we fail to publish
this for any reason, we should clean up and retrigger a leader election.
(Mark Miller)

* SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to
shard inconsistency. (Mark Miller)

* SOLR-3813: When a new leader syncs, we need to ask all shards to sync back,
not just those that are active. (Mark Miller)

* SOLR-3641: CoreContainer is not persisting roles core attribute.
(hossman, Mark Miller)

* SOLR-3527: SolrCmdDistributor drops some of the important commit attributes
(maxOptimizeSegments, softCommit, expungeDeletes) when sending a commit to
replicas. (Andy Laird, Tomas Fernandez Lobbe, Mark Miller)

* SOLR-3844: SolrCore reload can fail because it tries to remove the index
write lock while already holding it. (Mark Miller)

* SOLR-3831: Atomic updates do not distribute correctly to other nodes.
(Jim Musil, Mark Miller)

* SOLR-3465: Replication causes two searcher warmups.
(Michael Garski, Mark Miller)

* SOLR-3645: /terms should default to distrib=false. (Nick Cotton, Mark Miller)

* SOLR-3759: Various fixes to the example-DIH configs (Ahmet Arslan, hossman)

* SOLR-3777: Dataimport-UI does not send unchecked checkboxes (Glenn MacStravic
via steffkes)

* SOLR-3850: DataImportHandler "cacheKey" parameter was incorrectly renamed "cachePk"
(James Dyer)

* SOLR-3087: Fixed DOMUtil so that code doing attribute validation will
automatically ignore nodes in the reserved "xml" prefix - in particular this
fixes some bugs related to xinclude and fieldTypes.
(Amit Nithian, hossman)

* SOLR-3783: Fixed Pivot Faceting to work with facet.missing=true (hossman)

* SOLR-3869: A PeerSync attempt to its replicas by a candidate leader should
not fail on o.a.http.conn.ConnectTimeoutException. (Mark Miller)

* SOLR-3875: Fixed index boosts on multi-valued fields when docBoost is used
(hossman)

* SOLR-3878: Exception when using open-ended range query with CurrencyField (janhoy)

* SOLR-3891: CacheValue in CachingDirectoryFactory cannot be used outside of
solr.core package. (phunt via Mark Miller)

* SOLR-3892: Inconsistent locking when accessing cache in CachingDirectoryFactory
from RAMDirectoryFactory and MockDirectoryFactory. (phunt via Mark Miller)

* SOLR-3883: Distributed indexing forwards non-applicable request params.
(Dan Sutton, Per Steffensen, yonik, Mark Miller)

* SOLR-3903: Fixed MissingFormatArgumentException in ConcurrentUpdateSolrServer
(hossman)

* SOLR-3916: Fixed whitespace bug in parsing the fl param (hossman)

Other Changes
----------------------

* SOLR-3690: Fixed binary release packages to include dependencies needed for
the solr-test-framework (hossman)

* SOLR-2857: The /update/json and /update/csv URLs were restored to aid
in the migration of existing clients. (yonik)

* SOLR-3691: SimplePostTool: Mode for crawling/posting web pages
See http://wiki.apache.org/solr/ExtractingRequestHandler for examples (janhoy)

* SOLR-3707: Upgrade Solr to Tika 1.2 (janhoy)

* SOLR-2747: Updated changes2html.pl to handle Solr's CHANGES.txt; added
target 'changes-to-html' to solr/build.xml.
(Steve Rowe, Robert Muir)

* SOLR-3752: When a leader goes down, have the Overseer clear the leader state
in cluster.json (Mark Miller)

* SOLR-3751: Add defensive checks for SolrCloud updates and requests that ensure
the local state matches what we can tell the request expected. (Mark Miller)

* SOLR-3773: Hash based on the external String id rather than the indexed
representation for distributed updates. (Michael Garski, yonik, Mark Miller)

* SOLR-3780: Maven build: Make solrj tests run separately from solr-core.
(Steve Rowe)

* SOLR-3772: Optionally, on cluster startup, we can wait until we see all registered
replicas before running the leader process - or if they all do not come up,
N amount of time. (Jan Høydahl, Per Steffensen, Mark Miller)

* SOLR-3750: Optionally, on session expiration, we can explicitly wait some time before
running the leader sync process so that we are sure every node participates.
(Per Steffensen, Mark Miller)

* SOLR-3824: Velocity: Error messages from search not displayed (janhoy)

* SOLR-3826: Test framework improvements for specifying coreName on initCore
(Amit Nithian, hossman)

* SOLR-3749: Allow default UpdateLog syncLevel to be configured by
solrconfig.xml (Raintung Li, Mark Miller)

* SOLR-3845: Rename numReplicas to replicationFactor in Collections API.
(yonik, Mark Miller)

* SOLR-3815: SolrCloud - Add properties such as "range" to shards, which changes
the clusterstate.json and puts the shard replicas under "replicas". (yonik)

* SOLR-3871: SyncStrategy should use an executor for the threads it creates to
request recoveries. (Mark Miller)

* SOLR-3870: SyncStrategy should have a close so it can abort earlier on
shutdown. (Mark Miller)


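SOLR-3773 and SOLR-3815 above both concern how a document id is mapped to a
shard: the id is hashed as its external string form, and each shard carries a
hash "range". The sketch below illustrates that idea only. The even split of an
unsigned 32-bit space, the function names, and the use of CRC-32 as the hash
are all assumptions made for the example; they are not Solr's actual hash
function or range layout.

```python
import zlib


def shard_ranges(num_shards):
    """Split the unsigned 32-bit hash space into contiguous (start, end) ranges."""
    size = 2**32 // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = 2**32 - 1 if i == num_shards - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for(doc_id, ranges):
    """Pick the shard whose range contains the hash of the external string id."""
    h = zlib.crc32(doc_id.encode("utf-8"))  # stand-in hash, not Solr's
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash outside all ranges")


ranges = shard_ranges(2)
print(ranges)
print(shard_for("doc-42", ranges))
```

Because the hash input is the string the client sent, every node computes the
same shard for the same id without consulting the field type's indexed
representation.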
================== 4.0.0-BETA ===================


Versions of Major Components
---------------------
Apache Tika 1.1
Carrot2 3.5.0
Velocity 1.6.4 and Velocity Tools 2.0
Apache UIMA 2.3.1
Apache ZooKeeper 3.3.6

Upgrading from Solr 4.0.0-ALPHA
----------------------

Solr is now much more strict about requiring that the uniqueKeyField feature
(if used) must refer to a field which is not multiValued. If you upgrade from
an earlier version of Solr and see an error that your uniqueKeyField "can not
be configured to be multivalued" please add 'multiValued="false"' to the
<field /> declaration for your uniqueKeyField. See SOLR-3682 for more details.

Detailed Change List
----------------------

New Features
|
|
----------------------
|
|
|
|
* LUCENE-4201: Added JapaneseIterationMarkCharFilterFactory to normalize Japanese
|
|
iteration marks. (Robert Muir, Christian Moen)
|
|
|
|
* SOLR-1856: In Solr Cell, literals should override Tika-parsed values.
|
|
Patch adds a param "literalsOverride" which defaults to true, but can be set
|
|
to "false" to let Tika-parsed values be appended to literal values (Chris Harris, janhoy)
|
|
|
|
* SOLR-3488: Added a Collection management API for SolrCloud.
|
|
(Tommaso Teofili, Sami Siren, yonik, Mark Miller)
|
|
|
|
* SOLR-3559: Full deleteByQuery support with SolrCloud distributed indexing.
|
|
All replicas of a shard will be consistent, even if updates arrive in a
|
|
different order on different replicas. (yonik)
|
|
|
|
* SOLR-1929: Index encrypted documents with ExtractingUpdateRequestHandler.
|
|
By supplying resource.password=<mypw> or specifying an external file with regular
|
|
expressions matching file names, Solr will decrypt and index PDFs and DOCX formats.
|
|
(janhoy, Yiannis Pericleous)
|
|
|
|
* SOLR-3562: Add options to remove instance dir or data dir on core unload.
|
|
(Mark Miller, Per Steffensen)
|
|
|
|
* SOLR-2702: The default directory factory was changed to NRTCachingDirectoryFactory
|
|
which wraps the StandardDirectoryFactory and caches small files for improved
|
|
Near Real-time (NRT) performance. (Mark Miller, yonik)
|
|
|
|
* SOLR-2616: Include a sample java util logging configuration file.
|
|
(David Smiley, Mark Miller)
|
|
|
|
* SOLR-3460: Add cloud-scripts directory and a zkcli.sh|bat tool for easy scripting
|
|
and interaction with ZooKeeper. (Mark Miller)
|
|
|
|
* SOLR-1725: StatelessScriptUpdateProcessorFactory allows users to implement
|
|
the full ScriptUpdateProcessor API using any scripting language with a
|
|
javax.script.ScriptEngineFactory
|
|
(Uri Boness, ehatcher, Simon Rosenthal, hossman)
|
|
|
|
* SOLR-139: Change to updateable documents to create the document if it doesn't
|
|
already exist. To assert that the document must exist, use the optimistic
|
|
concurrency feature by specifying a _version_ of 1. (yonik)
|
|
|
|
* LUCENE-2510, LUCENE-4044: Migrated Solr's Tokenizer-, TokenFilter-, and
|
|
CharFilterFactories to the lucene-analysis module. To add new analysis
|
|
modules to Solr (like ICU, SmartChinese, Morfologik,...), just drop in
|
|
the JAR files from Lucene's binary distribution into your Solr instance's
|
|
lib folder. The factories are automatically made available with SPI.
|
|
(Chris Male, Robert Muir, Uwe Schindler)
|
|
|
|
* SOLR-3634, SOLR-3635: CoreContainer and CoreAdminHandler will now remember
|
|
and report back information about failures to initialize SolrCores. These
|
|
failures will be accessible from the web UI and CoreAdminHandler STATUS
|
|
command until they are "reset" by creating/renaming a SolrCore with the
|
|
same name. (hossman, steffkes)
|
|
|
|
* SOLR-1280: Added commented-out example of the new script update processor
|
|
to the example configuration. See http://wiki.apache.org/solr/ScriptUpdateProcessor (ehatcher)
|
|
|
|
* SOLR-3672: SimplePostTool: Improvements for posting files
|
|
Support for auto mode, recursive and wildcards (janhoy)
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* SOLR-3708: Add hashCode to ClusterState so that structures built based on the
|
|
ClusterState can be easily cached. (Mark Miller)
|
|
|
|
* SOLR-3709: Cache the url list created from the ClusterState in CloudSolrServer on each
|
|
request. (Mark Miller, yonik)
|
|
|
|
* SOLR-3710: Change CloudSolrServer so that update requests are only sent to leaders by
|
|
default. (Mark Miller)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
|
|
* SOLR-3582: Our ZooKeeper watchers respond to session events as if they are change events,
|
|
creating undesirable side effects. (Trym R. Møller, Mark Miller)
|
|
|
|
* SOLR-3467: ExtendedDismax escaping is missing several reserved characters
|
|
(Michael Dodsworth via janhoy)
|
|
|
|
* SOLR-3587: After reloading a SolrCore, the original Analyzer is still used rather than a new
|
|
one. (Alexey Serba, yonik, rmuir, Mark Miller)
|
|
|
|
* LUCENE-4185: Fix a bug where CharFilters were wrongly being applied twice. (Michael Froh, rmuir)
|
|
|
|
* SOLR-3610: After reloading a core, indexing would fail on any newly added fields to the schema. (Brent Mills, rmuir)
|
|
|
|
* SOLR-3377: edismax fails to correctly parse a fielded query wrapped by parens.
|
|
This regression was introduced in 3.6. (Bernd Fehling, Jan Høydahl, yonik)
|
|
|
|
* SOLR-3621: Fix rare concurrency issue when opening a new IndexWriter for replication or rollback.
|
|
(Mark Miller)
|
|
|
|
* SOLR-1781: Replication index directories not always cleaned up.
|
|
(Markus Jelsma, Terje Sten Bjerkseth, Mark Miller)
|
|
|
|
* SOLR-3639: Update ZooKeeper to 3.3.6 for a variety of bug fixes. (Mark Miller)
|
|
|
|
* SOLR-3629: Typo in solr.xml persistence when overriding the solrconfig.xml
|
|
file name using the "config" attribute prevented the override file from being
|
|
used. (Ryan Zezeski, hossman)
|
|
|
|
* SOLR-3642: Correct broken check for multivalued fields in stats.facet
|
|
(Yandong Yao, hossman)
|
|
|
|
* SOLR-3660: Velocity: Link to admin page broken (janhoy)
|
|
|
|
* SOLR-3658: Adding thousands of docs with one UpdateProcessorChain instance can briefly create
|
|
spikes of threads in the thousands. (yonik, Mark Miller)
|
|
|
|
* SOLR-3656: A core reload now always uses the same dataDir. (Mark Miller, yonik)
|
|
|
|
* SOLR-3662: Core reload bugs: a reload always obtained a non-NRT searcher, which
|
|
could go back in time with respect to the previous core's NRT searcher. Versioning
|
|
did not work correctly across a core reload, and update handler synchronization
|
|
was changed to synchronize on core state since more than one update handler
|
|
can coexist for a single index during a reload. (yonik)
|
|
|
|
* SOLR-3663: There are a couple of bugs in the sync process when a leader goes down and a
|
|
new leader is elected. (Mark Miller)
|
|
|
|
* SOLR-3623: Fixed inconsistent treatment of third-party dependencies for
|
|
solr contribs analysis-extras & uima (hossman)
|
|
|
|
* SOLR-3652: Fixed range faceting to error instead of looping infinitely
|
|
when 'gap' is zero -- or effectively zero due to floating point arithmetic
|
|
underflow. (hossman)
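The guard described in SOLR-3652 can be sketched in plain Java as follows. This is a minimal illustration, not Solr's range faceting code: the class name RangeGapSketch and the method lowerBounds are hypothetical, and the check (the next bound must be strictly greater than the current one) is an assumed way of detecting a gap that is zero or is lost to floating point precision.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not Solr's range faceting code. */
final class RangeGapSketch {

    /** Returns the lower bound of every bucket covering [start, end) in steps of gap. */
    static List<Double> lowerBounds(double start, double end, double gap) {
        List<Double> bounds = new ArrayList<>();
        double low = start;
        while (low < end) {
            double high = low + gap;
            if (high <= low) {
                // The gap is zero (or negative), or it vanished when added to a large value.
                throw new IllegalArgumentException(
                        "gap is effectively zero at " + low + "; refusing to loop forever");
            }
            bounds.add(low);
            low = high;
        }
        return bounds;
    }

    public static void main(String[] args) {
        System.out.println(lowerBounds(0.0, 10.0, 2.5));
        try {
            // In double arithmetic 1e17 + 1.0 is still 1e17, so the gap has no effect.
            lowerBounds(1e17, 2e17, 1.0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the check inside the loop, both a gap of 0 and the 1e17 case would never advance "low" and the loop would not terminate.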
|
|
|
|
* SOLR-3648: Fixed VelocityResponseWriter template loading in SolrCloud mode.
|
|
For the example configuration, this means /browse now works with SolrCloud.
|
|
(janhoy, ehatcher)
|
|
|
|
* SOLR-3677: Fixed misleading error message in web UI to distinguish between
|
|
no SolrCores loaded vs. no /admin/ handler available.
|
|
(hossman, steffkes)
|
|
|
|
* SOLR-3428: SolrCmdDistributor flushAdds/flushDeletes can cause repeated
|
|
adds/deletes to be sent (Mark Miller, Per Steffensen)
|
|
|
|
* SOLR-3647: DistributedQueue should use our Solr zk client rather than the std zk
|
|
client. ZooKeeper expiration can be permanent otherwise. (Mark Miller)
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-3524: Make discarding punctuation configurable in JapaneseTokenizerFactory.
|
|
The default is to discard punctuation, but this is overridable as an expert option.
|
|
(Kazuaki Hiraga, Jun Ohtani via Christian Moen)
|
|
|
|
* SOLR-1770: Move the default core instance directory into a collection1 folder.
|
|
(Mark Miller)
|
|
|
|
* SOLR-3355: Add shard and collection to SolrCore statistics. (Michael Garski, Mark Miller)
|
|
|
|
* SOLR-3575: solr.xml should default to persist=true (Mark Miller)
|
|
|
|
* SOLR-3563: Unloading all cores in a SolrCloud collection will now cause the removal of
|
|
that collection's meta data from ZooKeeper. (Mark Miller, Per Steffensen)
|
|
|
|
* SOLR-3599: Add zkClientTimeout to solr.xml so that it's obvious how to change it and so
|
|
that you can change it with a system property. (Mark Miller)
|
|
|
|
* SOLR-3609: Change Solr's expanded webapp directory to be at a consistent path called
|
|
solr-webapp rather than a temporary directory. (Mark Miller)
|
|
|
|
* SOLR-3600: Raise the default zkClientTimeout from 10 seconds to 15 seconds. (Mark Miller)
|
|
|
|
* SOLR-3215: Clone SolrInputDocument when distrib indexing so that update processors after
|
|
the distrib update process do not process the document twice. (Mark Miller)
|
|
|
|
* SOLR-3683: Improved error handling if an <analyzer> contains both an
|
|
explicit class attribute, as well as nested factories. (hossman)
|
|
|
|
* SOLR-3682: Fail to parse schema.xml if uniqueKeyField is multivalued (hossman)
|
|
|
|
* SOLR-2115: DIH no longer requires the "config" parameter to be specified in solrconfig.xml.
|
|
Instead, the configuration is loaded and parsed with every import. This allows the use of
|
|
a different configuration with each import, and makes correcting configuration errors simpler.
|
|
Also, the configuration itself can be passed using the "dataConfig" parameter rather than
|
|
using a file (this previously worked in debug mode only). When configuration errors are
|
|
encountered, the error message is returned in XML format. (James Dyer)
|
|
|
|
* SOLR-3439: Make SolrCell easier to use out of the box. Also improves "/browse" to display
|
|
rich-text documents correctly, along with facets for author and content_type.
|
|
With the new "content" field, highlighting of body is supported. See also SOLR-3672 for
|
|
easier posting of a whole directory structure. (Jack Krupansky, janhoy)
|
|
|
|
* SOLR-3579: SolrCloud view should default to the graph view rather than tree view.
|
|
(steffkes, Mark Miller)
|
|
|
|
================== 4.0.0-ALPHA ==================
|
|
More information about this release, including any errata related to the
|
|
release notes, upgrade instructions, or other changes may be found online at:
|
|
https://wiki.apache.org/solr/Solr4.0
|
|
|
|
|
|
Versions of Major Components
|
|
---------------------
|
|
Apache Tika 1.1
|
|
Carrot2 3.5.0
|
|
Velocity 1.6.4 and Velocity Tools 2.0
|
|
Apache UIMA 2.3.1
|
|
Apache ZooKeeper 3.3.4
|
|
|
|
|
|
Upgrading from Solr 3.6-dev
|
|
----------------------
|
|
|
|
* The Lucene index format has changed and as a result, once you upgrade,
|
|
previous versions of Solr will no longer be able to read your indices.
|
|
In a master/slave configuration, all searchers/slaves should be upgraded
|
|
before the master. If the master were to be updated first, the older
|
|
searchers would not be able to read the new index format.
|
|
|
|
* Setting abortOnConfigurationError=false is no longer supported
|
|
(since it has never worked properly). Solr will now warn you if
|
|
you attempt to set this configuration option at all. (see SOLR-1846)
|
|
|
|
* The default logic for the 'mm' param of the 'dismax' QParser has
|
|
been changed. If no 'mm' param is specified (either in the query,
|
|
or as a default in solrconfig.xml) then the effective value of the
|
|
'q.op' param (either in the query or as a default in solrconfig.xml
|
|
or from the 'defaultOperator' option in schema.xml) is used to
|
|
influence the behavior. If q.op is effectively "AND" then mm=100%.
|
|
If q.op is effectively "OR" then mm=0%. Users who wish to force the
|
|
legacy behavior should set a default value for the 'mm' param in
|
|
their solrconfig.xml file.
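The new default can be summarized by the following sketch. It is an illustration of the rule stated above, not the dismax parser's code: the class DismaxMmDefault and the method effectiveMm are hypothetical names, and treating a missing q.op as "OR" and comparing case-insensitively are assumptions.

```java
import java.util.Locale;

/** Sketch of the default 'mm' selection described above; not Solr's actual code. */
final class DismaxMmDefault {

    /**
     * @param mm  explicit 'mm' value (request or solrconfig.xml default), or null if unset
     * @param qOp effective 'q.op' value, already resolved from the request, solrconfig.xml,
     *            or the schema.xml defaultOperator; null is treated as "OR" (assumption)
     */
    static String effectiveMm(String mm, String qOp) {
        if (mm != null) {
            return mm; // an explicitly configured mm always wins
        }
        boolean isAnd = qOp != null && "AND".equals(qOp.trim().toUpperCase(Locale.ROOT));
        return isAnd ? "100%" : "0%";
    }

    public static void main(String[] args) {
        System.out.println(effectiveMm(null, "AND"));     // 100%
        System.out.println(effectiveMm(null, "OR"));      // 0%
        System.out.println(effectiveMm("2<-25%", "AND")); // 2<-25%
    }
}
```

The last call shows why setting an explicit 'mm' default restores the legacy behavior: the q.op value is only consulted when no 'mm' is present.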
|
|
|
|
* The VelocityResponseWriter is no longer built into the core. Its JAR and
|
|
dependencies now need to be added (via <lib> or solr/home lib inclusion),
|
|
and it needs to be registered in solrconfig.xml like this:
|
|
<queryResponseWriter name="velocity" class="solr.VelocityResponseWriter"/>
|
|
|
|
* The update request parameter to choose Update Request Processor Chain is
|
|
renamed from "update.processor" to "update.chain". The old parameter was
|
|
deprecated but still working since Solr 3.2, but is now removed
|
|
entirely.
|
|
|
|
* The <indexDefaults> and <mainIndex> sections of solrconfig.xml are discontinued
|
|
and replaced with the <indexConfig> section. There are also better defaults.
|
|
When migrating, if you don't know what your old settings mean, simply delete
|
|
both <indexDefaults> and <mainIndex> sections. If you have customizations,
|
|
put them in <indexConfig> section - with same syntax as before.
|
|
|
|
* Two of the SolrServer subclasses in SolrJ were renamed/replaced.
|
|
CommonsHttpSolrServer is now HttpSolrServer, and
|
|
StreamingUpdateSolrServer is now ConcurrentUpdateSolrServer.
|
|
|
|
* The PingRequestHandler no longer looks for a <healthcheck/> option in the
|
|
(legacy) <admin> section of solrconfig.xml. Users who wish to take
|
|
advantage of this feature should configure a "healthcheckFile" init param
|
|
directly on the PingRequestHandler. As part of this change, relative file
|
|
paths have been fixed to be resolved against the data dir. See the example
|
|
solrconfig.xml and SOLR-1258 for more details.
|
|
|
|
* Due to low level changes to support SolrCloud, the uniqueKey field can no
|
|
longer be populated via <copyField/> or <field default=...> in the
|
|
schema.xml. Users wishing to have Solr automatically generate a uniqueKey
|
|
value when adding documents should instead use an instance of
|
|
solr.UUIDUpdateProcessorFactory in their update processor chain. See
|
|
SOLR-2796 for more details.
|
|
|
|
|
|
Detailed Change List
|
|
----------------------
|
|
|
|
New Features
|
|
----------------------
|
|
|
|
* SOLR-3272: Solr filter factory for MorfologikFilter (Polish lemmatisation).
|
|
(Rafał Kuć via Dawid Weiss, Steven Rowe, Uwe Schindler).
|
|
|
|
* SOLR-571: The autowarmCount for LRUCaches (LRUCache and FastLRUCache) now
|
|
supports "percentages" which get evaluated relative to the current size of
|
|
the cache when warming happens.
|
|
(Tomas Fernandez Lobbe and hossman)
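As a rough illustration of how a percentage autowarmCount could be resolved at warming time, consider the sketch below. The class AutowarmCount and the method resolve are hypothetical names; truncating integer division and capping an absolute count at the current cache size are assumptions, not documented behavior.

```java
/** Sketch of resolving an autowarmCount setting; not the LRUCache implementation. */
final class AutowarmCount {

    /** Resolves a setting such as "128" or "50%" against the current cache size. */
    static int resolve(String setting, int currentCacheSize) {
        String s = setting.trim();
        if (s.endsWith("%")) {
            int percent = Integer.parseInt(s.substring(0, s.length() - 1).trim());
            // Assumption: integer arithmetic that truncates toward zero.
            return (int) ((long) currentCacheSize * percent / 100);
        }
        // Assumption: an absolute count cannot exceed the number of entries available.
        return Math.min(Integer.parseInt(s), currentCacheSize);
    }

    public static void main(String[] args) {
        System.out.println(resolve("50%", 1001)); // 500
        System.out.println(resolve("128", 1000)); // 128
        System.out.println(resolve("128", 40));   // 40
    }
}
```

The point of the percentage form is visible in the first call: the number of warmed entries follows the cache's size at warming time instead of being fixed in the configuration.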
|
|
|
|
* SOLR-1932: New relevancy function queries: termfreq, tf, docfreq, idf,
|
|
norm, maxdoc, numdocs. (yonik)
|
|
|
|
* SOLR-1665: Add debug component options for timings, results and query info only (gsingers, hossman, yonik)
|
|
|
|
* SOLR-2112: Solrj API now supports streaming results. (ryan)
|
|
|
|
* SOLR-792: Adding PivotFacetComponent for Hierarchical faceting
|
|
(ehatcher, Jeremy Hinegardner, Thibaut Lassalle, ryan)
|
|
|
|
* LUCENE-2507, SOLR-2571, SOLR-2576: Added DirectSolrSpellChecker, which uses Lucene's
|
|
DirectSpellChecker to retrieve correction candidates directly from the term dictionary using
|
|
Levenshtein automata. (James Dyer, rmuir)
|
|
|
|
* SOLR-1873, SOLR-2358: SolrCloud - added shared/central config and core/shard management via ZooKeeper,
|
|
built-in load balancing, and distributed indexing.
|
|
(Jamie Johnson, Sami Siren, Ted Dunning, yonik, Mark Miller)
|
|
Additional Work:
|
|
- SOLR-2324: SolrCloud solr.xml parameters are not persisted by CoreContainer.
|
|
(Massimo Schiavon, Mark Miller)
|
|
- SOLR-2287: Allow users to query by multiple, compatible collections with SolrCloud.
|
|
(Soheb Mahmood, Alex Cowell, Mark Miller)
|
|
- SOLR-2622: ShowFileRequestHandler does not work in SolrCloud mode.
|
|
(Stefan Matheis, Mark Miller)
|
|
- SOLR-3108: Error in SolrCloud's replica lookup code when replicas are hosted in the same Solr instance.
|
|
(Bruno Dumon, Sami Siren, Mark Miller)
|
|
- SOLR-3080: Remove shard info from ZooKeeper when SolrCore is explicitly unloaded.
|
|
(yonik, Mark Miller, siren)
|
|
- SOLR-3437: Recovery issues a spurious commit to the cluster. (Trym R. Møller via Mark Miller)
|
|
- SOLR-2822: Skip update processors already run on other nodes (hossman)
|
|
|
|
* SOLR-1566: Transforming documents in the ResponseWriters. This will allow
|
|
for more complex results in responses and open the door for function queries
|
|
as results.
|
|
(ryan with patches from grant, noble, cmale, yonik, Jan Høydahl,
|
|
Arul Kalaipandian, Luca Cavanna, hossman)
|
|
- SOLR-2037: Thanks to SOLR-1566, documents boosted by the QueryElevationComponent
|
|
can be marked as boosted. (gsingers, ryan, yonik)
|
|
|
|
* SOLR-2396: Add CollationField, which is much more efficient than
|
|
the Solr 3.x CollationKeyFilterFactory, and also supports
|
|
Locale-sensitive range queries. (rmuir)
|
|
|
|
* SOLR-2338: Add support for using <similarity/> in a schema's fieldType,
|
|
for customizing scoring on a per-field basis. (hossman, yonik, rmuir)
|
|
|
|
* SOLR-2335: New 'field("...")' function syntax for referring to complex
|
|
field names (containing whitespace or special characters) in functions.
|
|
|
|
* SOLR-2383: /browse improvements: generalize range and date facet display
|
|
(Jan Høydahl via yonik)
|
|
|
|
* SOLR-2272: Pseudo-join queries / filters. Examples:
|
|
- To restrict to the set of parents with at least one blue-eyed child:
|
|
fq={!join from=parent to=name}eyes:blue
|
|
- To restrict to the set of children with at least one blue-eyed parent:
|
|
fq={!join from=name to=parent}eyes:blue
|
|
(yonik)
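The semantics of the two join examples can be sketched over an in-memory list of documents. This is only a model of the behavior described above, not how Solr executes joins: JoinSketch, join and doc are hypothetical names, and fields are assumed to be single-valued strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory model of {!join from=F to=T}field:value; not Solr's implementation. */
final class JoinSketch {

    /** Builds a document from alternating field names and values. */
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            d.put(keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    static List<Map<String, String>> join(List<Map<String, String>> docs,
                                          String from, String to,
                                          String queryField, String queryValue) {
        // Step 1: run the inner query and collect the "from" values of its matches.
        Set<String> fromValues = new HashSet<>();
        for (Map<String, String> d : docs) {
            if (queryValue.equals(d.get(queryField)) && d.get(from) != null) {
                fromValues.add(d.get(from));
            }
        }
        // Step 2: keep the documents whose "to" value is one of the collected values.
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> d : docs) {
            if (fromValues.contains(d.get(to))) {
                result.add(d);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
                doc("name", "alice", "eyes", "brown"),
                doc("name", "bob", "parent", "alice", "eyes", "blue"),
                doc("name", "carol", "eyes", "blue"),
                doc("name", "dave", "parent", "carol", "eyes", "green"));
        // Parents with at least one blue-eyed child: alice
        System.out.println(join(docs, "parent", "name", "eyes", "blue"));
        // Children with at least one blue-eyed parent: dave
        System.out.println(join(docs, "name", "parent", "eyes", "blue"));
    }
}
```

Swapping "from" and "to" is what reverses the direction of the relationship, which is the only difference between the two fq examples.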
|
|
|
|
* SOLR-1942: Added the ability to select postings format per fieldType in schema.xml
|
|
as well as support custom Codecs in solrconfig.xml.
|
|
(simonw via rmuir)
|
|
|
|
* SOLR-2136: Boolean type added to function queries, along with
|
|
new functions exists(), if(), and(), or(), xor(), not(), def(),
|
|
and true and false constants. (yonik)
|
|
|
|
* SOLR-2491: Add support for using spellcheck collation in conjunction
|
|
with grouping. Note that the number of hits returned for collations
|
|
is the number of ungrouped hits. (James Dyer via rmuir)
|
|
|
|
* SOLR-1298: Return FunctionQuery as pseudo field. The solr 'fl' param
|
|
now supports functions. For example: fl=id,sum(x,y) -- NOTE: only
|
|
functions with fast random access are recommended. (yonik, ryan)
|
|
|
|
* SOLR-705: Optionally return shard info with each document in distributed
|
|
search. Use fl=id,[shard] to return the shard url. (ryan)
|
|
|
|
* SOLR-2417: Add explain info directly to return documents using
|
|
?fl=id,[explain] (ryan)
|
|
|
|
* SOLR-2533: Converted ValueSource.ValueSourceSortField over to new rewriteable Lucene
|
|
SortFields. ValueSourceSortField instances must be rewritten before they can be used.
|
|
This is done by SolrIndexSearcher when necessary. (Chris Male).
|
|
|
|
* SOLR-2193, SOLR-2565: You may now specify a 'soft' commit when committing. This will
|
|
use Lucene's NRT feature to avoid guaranteeing documents are on stable storage in exchange
|
|
for faster reopen times. There is also a new 'soft' autocommit tracker that can be
|
|
configured. (Mark Miller, Robert Muir)
|
|
|
|
* SOLR-2399: Updated Solr Admin interface. New look and feel with per core administration
|
|
and many new options. (Stefan Matheis via ryan)
|
|
|
|
* SOLR-1032: CSV handler now supports "literal.field_name=value" parameters.
|
|
(Simon Rosenthal, ehatcher)
|
|
|
|
* SOLR-2656: realtime-get, efficiently retrieves the latest stored fields for specified
|
|
documents, even if they are not yet searchable (i.e. without reopening a searcher)
|
|
(yonik)
|
|
|
|
* SOLR-2703: Added support for Lucene's "surround" query parser. (Simon Rosenthal, ehatcher)
|
|
|
|
* SOLR-2754: Added factories for several ranking algorithms:
|
|
- BM25SimilarityFactory: Okapi BM25
|
|
- DFRSimilarityFactory: Divergence from Randomness models
|
|
- IBSimilarityFactory: Information-based models
|
|
- LMDirichletSimilarityFactory: LM with Dirichlet smoothing
|
|
- LMJelinekMercerSimilarityFactory: LM with Jelinek-Mercer smoothing
|
|
(David Mark Nemeskey, Robert Muir)
|
|
|
|
* SOLR-2134: Trie* fields should support sortMissingLast=true, and deprecate Sortable* Field Types
|
|
(Ryan McKinley, Mike McCandless, Uwe Schindler, Erick Erickson)
|
|
|
|
* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing
|
|
for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
|
|
"multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
|
|
specify <analyzer type="multiterm"> (Pete Sturge, Erick Erickson, Mentoring from Seeley and Muir)
|
|
|
|
* SOLR-2481: Add support for commitWithin in DataImportHandler (Sami Siren via yonik)
|
|
|
|
* SOLR-2992: Add support for IndexWriter.prepareCommit() via prepareCommit=true
|
|
on update URLs. (yonik)
|
|
|
|
* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)
|
|
|
|
* SOLR-3069: Ability to add openSearcher=false to not open a searcher when doing
|
|
a hard commit. commitWithin now only invokes a softCommit. (yonik)
|
|
|
|
* SOLR-2802: New FieldMutatingUpdateProcessor and Factory to simplify the
|
|
development of UpdateProcessors that modify field values of documents as
|
|
they are indexed. Also includes several useful new implementations:
|
|
- RemoveBlankFieldUpdateProcessorFactory
|
|
- TrimFieldUpdateProcessorFactory
|
|
- HTMLStripFieldUpdateProcessorFactory
|
|
- RegexReplaceProcessorFactory
|
|
- FieldLengthUpdateProcessorFactory
|
|
- ConcatFieldUpdateProcessorFactory
|
|
- FirstFieldValueUpdateProcessorFactory
|
|
- LastFieldValueUpdateProcessorFactory
|
|
- MinFieldValueUpdateProcessorFactory
|
|
- MaxFieldValueUpdateProcessorFactory
|
|
- TruncateFieldUpdateProcessorFactory
|
|
- IgnoreFieldUpdateProcessorFactory
|
|
(hossman, janhoy)
|
|
|
|
* SOLR-3120: Optional post filtering for spatial queries bbox and geofilt
|
|
for LatLonType. (yonik)
|
|
|
|
* SOLR-2459: Expose LogLevel selection with a RequestHandler rather than a servlet
|
|
(Stefan Matheis, Upayavira, ryan)
|
|
|
|
* SOLR-3134: Include shard info in distributed response when shards.info=true
|
|
(Russell Black, ryan)
|
|
|
|
* SOLR-2898: Support grouped faceting. (Martijn van Groningen)
|
|
Additional Work:
|
|
- SOLR-3406: Extended grouped faceting support to facet.query and facet.range parameters.
|
|
(David Boychuck, Martijn van Groningen)
|
|
|
|
* SOLR-2949: QueryElevationComponent is now supported with distributed search.
|
|
(Mark Miller, yonik)
|
|
|
|
* SOLR-3221: Added the ability to directly configure aspects of the concurrency
|
|
and thread-pooling used within distributed search in solr. This allows for finer
|
|
grained control and can be tuned by end users to target their own specific
|
|
requirements. This builds on the work of the HttpCommComponent and uses the same configuration
|
|
block to configure the thread pool. The default configuration has
|
|
the same behaviour as solr 3.5, favouring throughput over latency. More
|
|
information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)
|
|
|
|
* SOLR-3278: Negative boost support to the Extended Dismax Query Parser Boost Query (bq).
|
|
(James Dyer)
|
|
|
|
* SOLR-3255: OpenExchangeRates.Org Exchange Rate Provider for CurrencyField (janhoy)
|
|
|
|
* SOLR-3358: Logging events are captured and available from the /admin/logging
|
|
request handler. (ryan)
|
|
|
|
* SOLR-1535: PreAnalyzedField type provides a functionality to index (and optionally store)
|
|
field content that was already processed and split into tokens using some external processing
|
|
chain. Serialization format is pluggable, and defaults to JSON. (ab)
|
|
|
|
* SOLR-3363: Consolidated Exceptions in Analysis Factories so they only throw
|
|
InitializationExceptions (Chris Male)
|
|
|
|
* SOLR-2690: New support for a "TZ" request param which overrides the TimeZone
|
|
used when rounding Dates in DateMath expressions for the entire request
|
|
(all date range queries and date faceting is affected). The default TZ
|
|
is still UTC. (David Schlotfeldt, hossman)
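The effect of a time zone on day rounding can be shown with the java.time API. This is a sketch of the concept only: it uses a modern Java API rather than Solr's DateMath code, and the class TzRoundingSketch and the method roundToDay are hypothetical names.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

/** Shows how the time zone changes the result of rounding down to the day. */
final class TzRoundingSketch {

    /** Rounds an instant down to the start of its day in the given zone (like NOW/DAY). */
    static Instant roundToDay(Instant now, String tz) {
        return now.atZone(ZoneId.of(tz)).truncatedTo(ChronoUnit.DAYS).toInstant();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2012-07-04T03:00:00Z");
        System.out.println(roundToDay(now, "UTC"));                 // 2012-07-04T00:00:00Z
        System.out.println(roundToDay(now, "America/Los_Angeles")); // 2012-07-03T07:00:00Z
    }
}
```

At 03:00 UTC it is still the previous evening in Los Angeles, so the same instant rounds to a different day boundary. That shift is what moves the edges of date range queries and date facets when TZ is supplied.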
|
|
|
|
* SOLR-3402: Analysis Factories are now configured with their Lucene Version
|
|
through setLuceneMatchVersion, rather than through the Map passed to init.
|
|
Parsing and simple error checking for the Version is now done inside
|
|
the code that creates the Analysis Factories. (Chris Male)
|
|
|
|
* SOLR-3178: Optimistic locking. If a _version_ is provided with an update
|
|
that does not match the version in the index, an HTTP 409 error (Conflict)
|
|
will result. (Per Steffensen, yonik)
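The version check can be modeled with a small in-memory store. This is a hypothetical sketch, not Solr's update handler: VersionedStore, add and update are illustrative names, versions come from a simple counter, and rejecting an update to a missing document is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of optimistic locking on _version_; not Solr's update handler. */
final class VersionedStore {
    static final int OK = 200;
    static final int CONFLICT = 409;

    private final Map<String, Long> versions = new HashMap<>();
    private long clock = 100; // hypothetical version source; real versions are opaque

    /** Unconditional add or overwrite; returns the newly assigned version. */
    long add(String id) {
        long v = ++clock;
        versions.put(id, v);
        return v;
    }

    /** Conditional update; succeeds only when the provided version is the current one. */
    int update(String id, long providedVersion) {
        Long current = versions.get(id);
        if (current == null || current != providedVersion) {
            return CONFLICT; // stale or unknown version: the caller must re-read and retry
        }
        versions.put(id, ++clock);
        return OK;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        long v = store.add("mydoc");
        System.out.println(store.update("mydoc", v)); // 200
        System.out.println(store.update("mydoc", v)); // 409, v is stale after the first update
    }
}
```

The second call fails because the first update changed the stored version. A client would typically fetch the document again, reapply its change, and resubmit with the fresh version.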
|
|
|
|
* SOLR-139: Updateable documents. JSON Example:
|
|
{"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}} will result in field "f1"
|
|
being set to 10, "f2" having an additional value of 20 added, and all
|
|
other existing fields unchanged. All source fields must be stored for
|
|
this feature to work correctly. (Ryan McKinley, Erik Hatcher, yonik)
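The effect of the "set" and "add" operations in the JSON example can be modeled as below. This is a sketch of the described semantics only: AtomicUpdateSketch and apply are hypothetical names, and documents are represented as maps from field name to a list of values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Models "set" and "add" on a stored document; not Solr's implementation. */
final class AtomicUpdateSketch {

    /** Applies {field: {"set"|"add": value}} operations to a copy of the stored document. */
    static Map<String, List<Object>> apply(Map<String, List<Object>> stored,
                                           Map<String, Map<String, Object>> ops) {
        // Start from the stored fields, which is why all source fields must be stored.
        Map<String, List<Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> e : stored.entrySet()) {
            result.put(e.getKey(), new ArrayList<Object>(e.getValue()));
        }
        for (Map.Entry<String, Map<String, Object>> fieldOps : ops.entrySet()) {
            String field = fieldOps.getKey();
            for (Map.Entry<String, Object> op : fieldOps.getValue().entrySet()) {
                if ("set".equals(op.getKey())) {
                    List<Object> values = new ArrayList<>();
                    values.add(op.getValue());
                    result.put(field, values); // replace whatever was there
                } else if ("add".equals(op.getKey())) {
                    List<Object> values = result.get(field);
                    if (values == null) {
                        values = new ArrayList<>();
                        result.put(field, values);
                    }
                    values.add(op.getValue()); // keep existing values, append one more
                } else {
                    throw new IllegalArgumentException("unsupported operation: " + op.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> stored = new LinkedHashMap<>();
        stored.put("id", Arrays.<Object>asList("mydoc"));
        stored.put("f1", Arrays.<Object>asList(1));
        stored.put("f2", Arrays.<Object>asList(5));
        stored.put("other", Arrays.<Object>asList("x"));

        Map<String, Map<String, Object>> ops = new LinkedHashMap<>();
        ops.put("f1", Collections.<String, Object>singletonMap("set", 10));
        ops.put("f2", Collections.<String, Object>singletonMap("add", 20));

        // Prints {id=[mydoc], f1=[10], f2=[5, 20], other=[x]}
        System.out.println(apply(stored, ops));
    }
}
```

Because the new document is rebuilt from the stored values, any field that is indexed but not stored would be missing from the result, which is the reason for the stored-fields requirement noted above.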
|
|
|
|
* SOLR-2857: Support XML,CSV,JSON, and javabin in a single RequestHandler and
|
|
choose the correct ContentStreamLoader based on Content-Type header. This
|
|
also deprecates the existing [Xml,JSON,CSV,Binary,Xslt]UpdateRequestHandler.
|
|
(ryan)
|
|
|
|
* SOLR-2585: Context-Sensitive Spelling Suggestions & Collations. This adds support
|
|
for the "spellcheck.alternativeTermCount" & "spellcheck.maxResultsForSuggest"
|
|
parameters, letting users receive suggestions even when all the queried terms
|
|
exist in the dictionary. This differs from "spellcheck.onlyMorePopular" in
|
|
that the suggestions need not consist entirely of terms with a greater document
|
|
frequency than the queried terms. (James Dyer)
|
|
|
|
* SOLR-2058: Edismax query parser to allow "phrase slop" to be specified per-field
|
|
on the pf/pf2/pf3 parameters using optional "FieldName~slop^boost" syntax. The
|
|
prior "FieldName^boost" syntax is still accepted. In such cases the value on the
|
|
"ps" parameter serves as the default slop. (Ron Mayer via James Dyer)
|
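The "FieldName~slop^boost" syntax from SOLR-2058 can be parsed along these lines. This is a hedged sketch rather than the edismax parser's code: PhraseFieldSpec and parse are hypothetical names, the regular expression is an assumed reading of the syntax, and a default boost of 1.0 is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses one pf/pf2/pf3 entry such as "title~2^10"; not the edismax parser itself. */
final class PhraseFieldSpec {
    final String field;
    final int slop;
    final float boost;

    // Field name, then an optional "~slop", then an optional "^boost".
    private static final Pattern SPEC =
            Pattern.compile("([^~^\\s]+)(?:~(\\d+))?(?:\\^(\\d+(?:\\.\\d+)?))?");

    private PhraseFieldSpec(String field, int slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    /** @param defaultSlop the value of the "ps" parameter, used when no ~slop is given */
    static PhraseFieldSpec parse(String spec, int defaultSlop) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("cannot parse phrase field: " + spec);
        }
        int slop = m.group(2) != null ? Integer.parseInt(m.group(2)) : defaultSlop;
        float boost = m.group(3) != null ? Float.parseFloat(m.group(3)) : 1.0f; // assumed default
        return new PhraseFieldSpec(m.group(1), slop, boost);
    }

    public static void main(String[] args) {
        PhraseFieldSpec a = parse("title~2^10", 0);
        PhraseFieldSpec b = parse("title^10", 3);
        System.out.println(a.field + " slop=" + a.slop + " boost=" + a.boost);
        System.out.println(b.field + " slop=" + b.slop + " boost=" + b.boost);
    }
}
```

The second call uses the older "FieldName^boost" form, so the slop falls back to the value passed in for "ps".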
|
|
|
* SOLR-3495: New UpdateProcessors have been added to create default values for
|
|
configured fields. These work similarly to the <field default="..."/>
|
|
option in schema.xml, but are applied in the UpdateProcessorChain, so they
|
|
may be used prior to other UpdateProcessors, or to generate a uniqueKey field
|
|
value when using the DistributedUpdateProcessor (ie: SolrCloud)
|
|
TimestampUpdateProcessorFactory
|
|
UUIDUpdateProcessorFactory
|
|
DefaultValueUpdateProcessorFactory
|
|
(hossman)
|
|
|
|
* SOLR-2993: Add WordBreakSolrSpellChecker to offer suggestions by combining adjacent
|
|
query terms and/or breaking terms into multiple words. This spellchecker can be
|
|
configured with a traditional checker (ie: DirectSolrSpellChecker). The results
|
|
are combined and collations can contain a mix of corrections from both spellcheckers.
|
|
(James Dyer)
|
|
|
|
* SOLR-3508: Simplify JSON update format for deletes as well as allow
|
|
version specification for optimistic locking. Examples:
|
|
- {"delete":"myid"}
|
|
- {"delete":["id1","id2","id3"]}
|
|
- {"delete":{"id":"myid", "_version_":123456789}}
|
|
(yonik)
|
|
|
|
* SOLR-3211: Allow parameter overrides in conjunction with "spellcheck.maxCollationTries".
|
|
To do so, use parameters starting with "spellcheck.collateParam." For instance, to
|
|
override the "mm" parameter, specify "spellcheck.collateParam.mm". This is helpful
|
|
in cases where testing spellcheck collations for result counts should use different
|
|
parameters from the main query (James Dyer)
|
|
|
|
* SOLR-2599: CloneFieldUpdateProcessorFactory provides similar functionality
|
|
to schema.xml's <copyField/> declaration but as an update processor that can
|
|
be combined with other processors in any order. (Jan Høydahl & hossman)
|
|
|
|
* SOLR-3351: eDismax: ps2 and ps3 params (janhoy)
|
|
|
|
* SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder
|
|
in example solrconfig.xml. (Sebastian Lutze, koji)
|
|
|
|
* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much
|
|
more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also
|
|
supports Locale-sensitive range queries. (rmuir)
|
|
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* SOLR-1875: Per-segment field faceting for single valued string fields.
|
|
Enable with facet.method=fcs, control the number of threads used with
|
|
the "threads" local param on the facet.field param. This algorithm will
|
|
only be faster in the presence of rapid index changes. (yonik)
|
|
|
|
* SOLR-1904: When facet.enum.cache.minDf > 0 and the base doc set is a
|
|
SortedIntSet, convert to HashDocSet for better performance. (yonik)
|
|
|
|
* SOLR-2092: Speed up single-valued and multi-valued "fc" faceting. Typical
|
|
improvement is 5%, but can be much greater (up to 10x faster) when facet.offset
|
|
is very large (deep paging). (yonik)
|
|
|
|
* SOLR-2193, SOLR-2565: The default Solr update handler has been improved so
|
|
that it uses fewer locks, keeps the IndexWriter open rather than closing it
|
|
on each commit (ie commits no longer wait for background merges to complete),
|
|
works with SolrCore to provide faster 'soft' commits, and has an improved API
|
|
that requires less instanceof special casing. (Mark Miller, Robert Muir)
|
|
Additional Work:
|
|
- SOLR-2697: commit and autocommit operations don't reset
|
|
DirectUpdateHandler2.numDocsPending stats attribute.
|
|
(Alexey Serba, Mark Miller)
|
|
|
|
* SOLR-2950: The QueryElevationComponent now avoids using the FieldCache and looking up
|
|
every document id (gsingers, yonik)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
* SOLR-3139: Make ConcurrentUpdateSolrServer send UpdateRequest.getParams()
|
|
as HTTP request params (siren)
|
|
|
|
* SOLR-3165: Cannot use DIH in SolrCloud + ZooKeeper (Alexey Serba,
|
|
Mark Miller, siren)
|
|
|
|
* SOLR-3068: Occasional NPE in ThreadDumpHandler (siren)
|
|
|
|
* SOLR-2762: FSTLookup could return duplicate results or one result fewer
|
|
than requested. (David Smiley, Dawid Weiss)
|
|
|
|
* SOLR-2741: Bugs in facet range display in trunk (janhoy)
|
|
|
|
* SOLR-1908: Fixed SignatureUpdateProcessor to fail to initialize on
|
|
invalid config. Specifically: a signatureField that does not exist,
|
|
or overwriteDupes=true with a signatureField that is not indexed.
|
|
(hossman)
|
|
|
|
* SOLR-1824: IndexSchema will now fail to initialize if there is a
|
|
problem initializing one of the fields or field types. (hossman)
|
|
|
|
* SOLR-1928: TermsComponent didn't correctly break ties for non-text
|
|
fields sorted by count. (yonik)
|
|
|
|
* SOLR-2107: MoreLikeThisHandler doesn't work with alternate qparsers. (yonik)
|
|
|
|
* SOLR-2108: Fixed false positives when using wildcard queries on fields with reversed
|
|
wildcard support. For example, a query of *zemog* would match documents that contain
|
|
'gomez'. (Landon Kuhn via Robert Muir)
|
|
|
|
* SOLR-1962: SolrCore#initIndex should not use a mix of indexPath and newIndexPath (Mark Miller)
|
|
|
|
* SOLR-2275: fix DisMax 'mm' parsing to be tolerant of whitespace
|
|
(Erick Erickson via hossman)
|
|
|
|
* SOLR-2193, SOLR-2565, SOLR-2651: SolrCores now properly share IndexWriters across SolrCore reloads.
|
|
(Mark Miller, Robert Muir)
|
|
Additional Work:
|
|
- SOLR-2705: On reload, IndexWriterProvider holds onto the initial SolrCore it was created with.
|
|
(Yury Kats, Mark Miller)
|
|
|
|
* SOLR-2682: Remove addException() in SimpleFacet. FacetComponent no longer catches and embeds
|
|
exceptions that occurred during facet processing; it throws HTTP 400 or 500 exceptions instead. (koji)
|
|
|
|
* SOLR-2654: Directories used by a SolrCore are now closed when they are no longer used.
|
|
(Mark Miller)
|
|
|
|
* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling,
|
|
rather than loading URL content streams automatically regardless of use.
|
|
(David Smiley and Ryan McKinley via ehatcher)
|
|
|
|
* SOLR-2829: Fix problem with false-positives due to incorrect
|
|
equals methods. (Yonik Seeley, Hossman, Erick Erickson.
|
|
Marc Tinnemeyer caught the bug)
|
|
|
|
* SOLR-2848: Removed 'instanceof AbstractLuceneSpellChecker' hacks from distributed spellchecking code,
|
|
and added a merge() method to SolrSpellChecker instead. Previously if you extended SolrSpellChecker
|
|
your spellchecker would not work in distributed fashion. (James Dyer via rmuir)
|
|
|
|
* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when the term contains
|
|
a hyphen. (Thomas Gambier caught the bug, Steffen Godskesen did the patch, via Erick Erickson)
|
|
|
|
* SOLR-1730: Made it clearer when a core failed to load as well as better logging when the
|
|
QueryElevationComponent fails to properly initialize (gsingers)
|
|
|
|
* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)
|
|
|
|
* SOLR-3037: When using binary format in solrj the codec screws up parameters
|
|
(Sami Siren, Jörg Maier via yonik)
|
|
|
|
* SOLR-3062: A join in the main query was not respecting any filters pushed
|
|
down to it via acceptDocs since LUCENE-1536. (Mike Hugo, yonik)
|
|
|
|
* SOLR-3214: If you use multiple fl entries rather than a comma separated list, all but the first
|
|
entry can be ignored if you are using distributed search. (Tomas Fernandez Lobbe via Mark Miller)
|
|
|
|
* SOLR-3352: eDismax: pf2 should kick in for a query with 2 terms (janhoy)
|
|
|
|
* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit
|
|
(James Dyer, Tomas Fernandez Lobbe)
|
|
|
|
* SOLR-2605: fixed tracking of the 'defaultCoreName' in CoreContainer so that
|
|
CoreAdminHandler could return consistent information regardless of whether
|
|
there is a default core name or not. (steffkes, hossman)
|
|
|
|
* SOLR-3370: fixed CSVResponseWriter to respect globs in the 'fl' param
|
|
(Keith Fligg via hossman)
|
|
|
|
* SOLR-3436: Group count incorrect when not all shards are queried in the second
|
|
pass. (Francois Perron, Martijn van Groningen)
|
|
|
|
* SOLR-3454: Exception when using result grouping with main=true and using
|
|
wt=javabin. (Ludovic Boutros, Martijn van Groningen)
|
|
|
|
* SOLR-3446: Better errors when PatternTokenizerFactory is configured with
|
|
an invalid pattern, and include the 'name' whenever possible in plugin init
|
|
error messages. (hossman)
|
|
|
|
* LUCENE-4075: Cleaner path usage in TestXPathEntityProcessor
|
|
(Greg Bowyer via hossman)
|
|
|
|
* SOLR-2923: IllegalArgumentException when using useFilterForSortedQuery on an
|
|
empty index. (Adrien Grand via Mark Miller)
|
|
|
|
* SOLR-2352: Fixed TermVectorComponent so that it will not fail if the fl
|
|
param contains globs or pseudo-fields (hossman)
|
|
|
|
* SOLR-3541: add missing solrj dependencies to binary packages.
|
|
(Thijs Vonk via siren)
|
|
|
|
* SOLR-3522: fixed parsing of the 'literal()' function (hossman)
|
|
|
|
* SOLR-3548: Fixed a bug in the cachability of queries using the {!join}
|
|
parser or the strdist() function, as well as some minor improvements to
|
|
the hashCode implementation of {!bbox} and {!geofilt} queries.
|
|
(hossman)
|
|
|
|
* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
|
|
are respected now (Stanislaw Osinski, Dawid Weiss)
|
|
|
|
* SOLR-3430: Added a new DIH test against a real SQL database. Fixed problems
|
|
revealed by this new test related to the expanded cache support added to
|
|
3.6/SOLR-2382 (James Dyer)
|
|
|
|
* SOLR-1958: When using the MailEntityProcessor, import would fail if
|
|
fetchMailsSince was not specified. (Max Lynch via James Dyer)
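
The whitespace tolerance described for SOLR-2275 above can be illustrated
with a small sketch. This is not Solr's Java code: it is a Python
approximation of the documented 'mm' semantics (integers, percentages,
negative values, and "N<spec" conditions), and the function name
min_should_match is hypothetical.

```python
import re


def min_should_match(optional_clauses, spec):
    """Return how many of `optional_clauses` must match under the mm `spec`."""
    # Normalise whitespace: trim both ends and drop any spaces around '<',
    # so " 2 < -25% " is treated exactly like "2<-25%".
    spec = re.sub(r"\s*<\s*", "<", spec.strip())
    if "<" in spec:
        result = optional_clauses
        # Conditions are separated by whitespace and checked left to right.
        for condition in spec.split():
            upper_bound, sub_spec = condition.split("<", 1)
            if optional_clauses <= int(upper_bound):
                return result
            result = min_should_match(optional_clauses, sub_spec)
        return result
    if spec.endswith("%"):
        # int() truncates toward zero, so -25% of 5 clauses gives -1.
        calc = int(optional_clauses * int(spec[:-1]) / 100)
    else:
        calc = int(spec)
    # A negative value means "all clauses except this many".
    result = optional_clauses + calc if calc < 0 else calc
    return max(0, min(optional_clauses, result))
```

With this sketch, min_should_match(5, "2<-25% 9<-3") and
min_should_match(5, " 2 < -25%   9 < -3 ") give the same answer.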
|
|
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-1846: Eliminate support for the abortOnConfigurationError
|
|
option. It has never worked very well, and in recent versions of
|
|
Solr hasn't worked at all. (hossman)
|
|
|
|
* SOLR-1889: The default logic for the 'mm' param of DismaxQParser and
|
|
ExtendedDismaxQParser has been changed to be determined based on the
|
|
effective value of the 'q.op' param (hossman)
|
|
|
|
* SOLR-1946: Misc improvements to the SystemInfoHandler: /admin/system
|
|
(hossman)
|
|
|
|
* SOLR-2289: Tweak spatial coords for example docs so they are a bit
|
|
more spread out (Erick Erickson via hossman)
|
|
|
|
* SOLR-2288: Small tweaks to eliminate compiler warnings, primarily
|
|
using Generics where applicable in method/object declarations, and
|
|
adding @SuppressWarnings("unchecked") when appropriate (hossman)
|
|
|
|
* SOLR-2375: Suggester Lookup implementations now store trie data
|
|
and load it back on init. This means that large tries don't have to be
|
|
rebuilt on every commit or core reload. (ab)
|
|
|
|
* SOLR-2413: Support for returning multi-valued fields w/o <arr> tag
|
|
in the XMLResponseWriter was removed. XMLResponseWriter no longer
works with 'version' values less than 2.2 (ryan)
|
|
|
|
* SOLR-2423: FieldType argument changed from String to Object.
|
|
Conversion from SolrInputDocument > Object > Fieldable is now managed
|
|
by FieldType rather than DocumentBuilder. (ryan)
|
|
|
|
* SOLR-2461: QuerySenderListener and AbstractSolrEventListener are
|
|
now public (hossman)
|
|
|
|
* LUCENE-2995: Moved some spellchecker and suggest APIs to modules/suggest:
|
|
HighFrequencyDictionary, SortedIterator, TermFreqIterator, and the
|
|
suggester APIs and implementations. (rmuir)
|
|
|
|
* SOLR-2576: Remove deprecated SpellingResult.add(Token, int).
|
|
(James Dyer via rmuir)
|
|
|
|
* LUCENE-3232: Moved MutableValue classes to new 'common' module. (Chris Male)
|
|
|
|
* LUCENE-2883: FunctionQuery, DocValues (and its impls), ValueSource (and its
|
|
impls) and BoostedQuery have been consolidated into the queries module. They
|
|
can now be found at o.a.l.queries.function.
|
|
|
|
* SOLR-2027: FacetField.getValues() now returns an empty list if there are no
|
|
values, instead of null (Chris Male)
|
|
|
|
* SOLR-1825: SolrQuery.addFacetQuery now enables facets automatically, like
|
|
addFacetField (Chris Male)
|
|
|
|
* SOLR-2663: FieldTypePluginLoader has been refactored out of IndexSchema
|
|
and made public. (hossman)
|
|
|
|
* SOLR-2331,SOLR-2691: Refactor CoreContainer's SolrXML serialization code and improve testing
|
|
(Yury Kats, hossman, Mark Miller)
|
|
|
|
* SOLR-2698: Enhance CoreAdmin STATUS command to return index size.
|
|
(Yury Kats, hossman, Mark Miller)
|
|
|
|
* SOLR-2654: The same Directory instance is now always used across a SolrCore so that
|
|
it's easier to add other DirectoryFactory implementations without static caching hacks.
|
|
(Mark Miller)
|
|
|
|
* LUCENE-3286: 'luke' ant target has been disabled due to incompatibilities with XML
|
|
queryparser location (Chris Male)
|
|
|
|
* SOLR-1897: The data dir from the core descriptor should override the data dir from
|
|
the solrconfig.xml rather than the other way round. (Mark Miller)
|
|
|
|
* SOLR-2756: Maven configuration: Excluded transitive stax:stax-api dependency
|
|
from org.codehaus.woodstox:wstx-asl dependency. (David Smiley via Steve Rowe)
|
|
|
|
* SOLR-2588: Moved VelocityResponseWriter back to contrib module in order to
|
|
remove it as a mandatory core dependency. (ehatcher)
|
|
|
|
* SOLR-2862: More explicit lexical resources location logged if Carrot2 clustering
|
|
extension is used. Fixed solr. impl. of IResource and IResourceLookup. (Dawid Weiss)
|
|
|
|
* SOLR-1123: Changed JSONResponseWriter to now use application/json as its Content-Type
|
|
by default. However the Content-Type can be overwritten and is set to text/plain in
|
|
the example configuration. (Uri Boness, Chris Male)
|
|
|
|
* SOLR-2607: Removed deprecated client/ruby directory, which included solr-ruby and flare.
|
|
(ehatcher)
|
|
|
|
* SOLR-3032: logOnce from SolrException and all the supporting
|
|
structure is gone. abortOnConfigurationError is also gone as it is no longer referenced.
|
|
Errors should be caught and logged at the top-most level or logged and NOT propagated up the
|
|
chain. (Erick Erickson)
|
|
|
|
* SOLR-2105: Remove support for deprecated "update.processor" (since 3.2), in favor of
|
|
"update.chain" (janhoy)
|
|
|
|
* SOLR-3005: Default QueryResponseWriters are now initialized via init() with an empty
|
|
NamedList. (Gasol Wu, Chris Male)
|
|
|
|
* SOLR-2607: Removed obsolete client/ folder (ehatcher, Eric Pugh, janhoy)
|
|
|
|
* SOLR-3202, SOLR-3244: Dropping Support for JSP. New Admin UI is all client side
|
|
(ryan, Aliaksandr Zhuhrou, Uwe Schindler)
|
|
|
|
* SOLR-3159: Upgrade example and tests to run with Jetty 8 (ryan)
|
|
|
|
* SOLR-3254: Upgrade Solr to Tika 1.1 (janhoy)
|
|
|
|
* SOLR-3329: Dropped getSourceID() from SolrInfoMBean and using
|
|
getClass().getPackage().getSpecificationVersion() for Version. (ryan)
|
|
|
|
* SOLR-3302: Upgraded SLF4j to version 1.6.4 (hossman)
|
|
|
|
* SOLR-3322: Add more context to IndexReaderFactory.newReader (ab)
|
|
|
|
* SOLR-3343: Moved FastWriter, FileUtils, RegexFileFilter, RTimer and SystemIdResolver
|
|
from org.apache.solr.common to org.apache.solr.util (Chris Male)
|
|
|
|
* SOLR-3357: ResourceLoader.newInstance now accepts a Class representation of the expected
|
|
instance type (Chris Male)
|
|
|
|
* SOLR-3388: HTTP caching is now disabled by default for RequestUpdateHandlers. (ryan)
|
|
|
|
* SOLR-3309: web.xml now specifies metadata-complete=true (which requires
|
|
Servlet 2.5) to prevent servlet containers from scanning class annotations
|
|
on startup. This allows for faster startup times on some servlet containers.
|
|
(Bill Bell, hossman)
|
|
|
|
* SOLR-1893: Refactored some common code from LRUCache and FastLRUCache into
|
|
SolrCacheBase (Tomás Fernández Löbbe via hossman)
|
|
|
|
* SOLR-3403: Deprecated Analysis Factories now log their own deprecation messages.
|
|
No logging support is provided by Factory parent classes. (Chris Male)
|
|
|
|
* SOLR-1258: PingRequestHandler is now directly configured with a
|
|
"healthcheckFile" instead of looking for the legacy
|
|
<admin><healthcheck/></admin> syntax. Filenames specified as relative
|
|
paths have been fixed so that they are resolved against the data dir
|
|
instead of the CWD of the java process. (hossman)
|
|
|
|
* SOLR-3083: JMX beans now report Numbers as numeric values rather than Strings
|
|
(Tagged Siteops, Greg Bowyer via ryan)
|
|
|
|
* SOLR-2796: Due to low level changes to support SolrCloud, the uniqueKey
|
|
field can no longer be populated via <copyField/> or <field default=...>
|
|
in the schema.xml.
|
|
|
|
* SOLR-3534: The Dismax and eDismax query parsers will fall back on the 'df' parameter
|
|
when 'qf' is absent. If neither parameter is present and the schema has no default
search field, an exception is now thrown. (dsmiley)
|
|
|
|
* SOLR-3262: The "threads" feature of DIH is removed (deprecated in Solr 3.6)
|
|
(James Dyer)
|
|
|
|
* SOLR-3422: Refactored DIH internal data classes. All entities in
|
|
data-config.xml must have a name (James Dyer)
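
The parameter fallbacks described for SOLR-1889 and SOLR-3534 above can be
sketched as follows. This is an illustrative Python approximation, not the
parsers' Java code. The function names are hypothetical, and the concrete
defaults ("100%" when the effective operator is AND, "0%" otherwise) are
stated here as an assumption about the resulting behavior.

```python
def resolve_query_fields(params, schema_default_field=None):
    """Pick the fields to search: 'qf' first, then 'df', then the schema default."""
    query_fields = params.get("qf", "").split()
    if query_fields:
        return query_fields  # entries may carry boosts, e.g. "title^2"
    default_field = params.get("df") or schema_default_field
    if default_field:
        return [default_field]
    raise ValueError("no 'qf', no 'df', and no schema default search field")


def default_mm(params, schema_default_operator="OR"):
    """Derive 'mm' from the effective 'q.op' when 'mm' is not given explicitly."""
    if "mm" in params:
        return params["mm"]
    operator = params.get("q.op", schema_default_operator).upper()
    # Assumed defaults: every clause required for AND, none required for OR.
    return "100%" if operator == "AND" else "0%"
```

An explicit 'mm' always wins in this sketch; 'q.op' only matters when 'mm'
is absent.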
|
|
|
|
Documentation
|
|
----------------------
|
|
|
|
* SOLR-2232: Improved README info on solr.solr.home in examples
|
|
(Eric Pugh and hossman)
|
|
|
|
================== 3.6.1 ==================
|
|
More information about this release, including any errata related to the
|
|
release notes, upgrade instructions, or other changes may be found online at:
|
|
https://wiki.apache.org/solr/Solr3.6.1
|
|
|
|
Bug Fixes
|
|
|
|
* LUCENE-3969: Throw IAE on bad arguments that could cause confusing errors in
|
|
PatternTokenizer. CommonGrams populates PositionLengthAttribute correctly.
|
|
(Uwe Schindler, Mike McCandless, Robert Muir)
|
|
|
|
* SOLR-3361: ReplicationHandler "maxNumberOfBackups" doesn't work if backups are triggered on commit
|
|
(James Dyer, Tomas Fernandez Lobbe)
|
|
|
|
* SOLR-3375: Fix charset problems with HttpSolrServer (Roger Håkansson, yonik, siren)
|
|
|
|
* SOLR-3436: Group count incorrect when not all shards are queried in the second
|
|
pass. (Francois Perron, Martijn van Groningen)
|
|
|
|
* SOLR-3454: Exception when using result grouping with main=true and using
|
|
wt=javabin. (Ludovic Boutros, Martijn van Groningen)
|
|
|
|
* SOLR-3489: Config file replication less error prone (Jochen Just via janhoy)
|
|
|
|
* SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)
|
|
|
|
* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
|
|
are respected now (Stanislaw Osinski, Dawid Weiss)
|
|
|
|
* SOLR-3360: More DIH bug fixes for the deprecated "threads" parameter.
|
|
(Mikhail Khludnev, Claudio R, via James Dyer)
|
|
|
|
* SOLR-3430: Added a new DIH test against a real SQL database. Fixed problems
|
|
revealed by this new test related to the expanded cache support added to
|
|
3.6/SOLR-2382 (James Dyer)
|
|
|
|
* SOLR-3336: SolrEntityProcessor substitutes most variables at query time.
|
|
(Michael Kroh, Lance Norskog, via Martijn van Groningen)
|
|
|
|
|
|
================== 3.6.0 ==================
|
|
More information about this release, including any errata related to the
|
|
release notes, upgrade instructions, or other changes may be found online at:
|
|
https://wiki.apache.org/solr/Solr3.6
|
|
|
|
Upgrading from Solr 3.5
|
|
----------------------
|
|
* SOLR-2983: As a consequence of moving the code which sets a MergePolicy from SolrIndexWriter to SolrIndexConfig,
|
|
(custom) MergePolicies should now have an empty constructor; thus an IndexWriter should not be passed as constructor
|
|
parameter but instead set using the setIndexWriter() method.
|
|
|
|
* As the doGet() methods in SimplePostTool were changed to static, client applications of this
|
|
class need to be recompiled.
|
|
|
|
* In Solr version 3.5 and earlier, HTMLStripCharFilter had known bugs in the
|
|
character offsets it provided, triggering e.g. exceptions in highlighting.
|
|
HTMLStripCharFilter has been re-implemented, addressing this and other
|
|
issues. See the entry for LUCENE-3690 in the Bug Fixes section below for a
|
|
detailed list of changes. For people who depend on the behavior of
|
|
HTMLStripCharFilter in Solr version 3.5 and earlier: the old implementation
|
|
(bugs and all) is preserved as LegacyHTMLStripCharFilter.
|
|
|
|
* As of Solr 3.6, the <indexDefaults> and <mainIndex> sections of solrconfig.xml are deprecated
|
|
and replaced with a new <indexConfig> section. Read more in SOLR-1052 below.
|
|
|
|
* SOLR-3040: The DIH's admin UI (dataimport.jsp) now requires DIH request handlers to start with
|
|
a '/'. (dsmiley)
|
|
|
|
* SOLR-3161: <requestDispatcher handleSelect="false"> is now the default. An existing config will
|
|
probably work as-is because handleSelect was explicitly enabled in default configs. HandleSelect
|
|
makes /select work as well as enables the 'qt' parameter. Instead, consider explicitly
|
|
configuring /select as is done in the example solrconfig.xml, and register your other search
|
|
handlers with a leading '/' which is a recommended practice. (David Smiley, Erik Hatcher)
|
|
|
|
* SOLR-3161: Don't use the 'qt' parameter with a leading '/'. It probably won't work in 4.0
|
|
and it's now limited in 3.6 to SearchHandler subclasses that aren't lazy-loaded.
|
|
|
|
* SOLR-2724: Specifying <defaultSearchField> and <solrQueryParser defaultOperator="..."/> in
|
|
schema.xml is now considered deprecated. Instead you are encouraged to specify these via the "df"
|
|
and "q.op" parameters in your request handler definition. (David Smiley)
|
|
|
|
* Bugs found and fixed in the SignatureUpdateProcessor that previously caused
|
|
some documents to produce the same signature even when the configured fields
|
|
contained distinct (non-String) values. Users of SignatureUpdateProcessor
|
|
are strongly advised that they should re-index as document signatures may
|
|
have now changed. (see SOLR-3200 & SOLR-3226 for details)
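
For the SOLR-2724 and SOLR-3161 notes above: besides setting "df" and "q.op"
as defaults in a request handler definition, the same two parameters can be
sent on each request to a handler registered with a leading '/'. The sketch
below only builds the request URL. The base URL and the field name "text"
are assumptions for illustration, and build_select_url is a hypothetical
helper.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, default_field, default_operator="OR"):
    """Build a /select URL that passes 'df' and 'q.op' explicitly."""
    params = [
        ("q", query),
        ("df", default_field),       # replaces <defaultSearchField> in schema.xml
        ("q.op", default_operator),  # replaces <solrQueryParser defaultOperator=.../>
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical local instance and field name.
url = build_select_url("http://localhost:8983/solr/", "solr rocks", "text", "AND")
```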
|
|
|
|
New Features
|
|
----------------------
|
|
* SOLR-2020: Add Java client that uses Apache Http Components http client (4.x).
|
|
(Chantal Ackermann, Ryan McKinley, Yonik Seeley, siren)
|
|
|
|
* SOLR-2854: Now load URL content stream data (via stream.url) when called for during request handling,
|
|
rather than loading URL content streams automatically regardless of use.
|
|
(David Smiley and Ryan McKinley via ehatcher)
|
|
|
|
* SOLR-2904: BinaryUpdateRequestHandler should be able to accept multiple update requests from
|
|
a stream (shalin)
|
|
|
|
* SOLR-1565: StreamingUpdateSolrServer supports RequestWriter API and therefore, javabin update
|
|
format (shalin)
|
|
|
|
* SOLR-2438: Added MultiTermAwareComponent to the various classes to allow automatic lowercasing
for multiterm queries (wildcards, regex, prefix, range, etc). You can now optionally specify a
"multiterm" analyzer in your schema.xml, but Solr should "do the right thing" if you don't
specify <analyzer type="multiterm"> (Pete Sturge, Erick Erickson, Mentoring from Seeley and Muir)
|
|
|
|
* SOLR-2919: Added support for localized range queries when the analysis chain uses
|
|
CollationKeyFilter or ICUCollationKeyFilter. (Michael Sokolov, rmuir)
|
|
|
|
* SOLR-2982: Added BeiderMorseFilterFactory for Beider-Morse (BMPM) phonetic encoder. Upgrades
|
|
commons-codec to version 1.6 (Brooke Schreier Ganz, rmuir)
|
|
|
|
* SOLR-1843: A new "rootName" attribute is now available when
|
|
configuring <jmx/> in solrconfig.xml. If this attribute is set,
|
|
Solr will use it as the root name for all MBeans Solr exposes via
|
|
JMX. The default root name is "solr" followed by the core name.
|
|
(Constantijn Visinescu, hossman)
|
|
|
|
* SOLR-2906: Added LFU cache options to Solr. (Shawn Heisey via Erick Erickson)
|
|
|
|
* SOLR-3036: Ability to specify overwrite=false on the URL for XML updates.
|
|
(Sami Siren via yonik)
|
|
|
|
* SOLR-2603: Add the encoding function for alternate fields in highlighting.
|
|
(Massimo Schiavon, koji)
|
|
|
|
* SOLR-1729: Evaluation of NOW for date math is done only once per request for
|
|
consistency, and is also propagated to shards in distributed search.
|
|
Adding a parameter NOW=<time_in_ms> to the request will override the
|
|
current time. (Peter Sturge, yonik, Simon Willnauer)
|
|
|
|
* SOLR-1709: Distributed support for Date and Numeric Range Faceting
|
|
(Peter Sturge, David Smiley, hossman, Simon Willnauer)
|
|
|
|
* SOLR-3054, LUCENE-3671: Add TypeTokenFilterFactory that creates TypeTokenFilter
|
|
that filters tokens based on their TypeAttribute. (Tommaso Teofili via
|
|
Uwe Schindler)
|
|
|
|
* LUCENE-3305, SOLR-3056: Added Kuromoji morphological analyzer for Japanese.
|
|
See the 'text_ja' fieldtype in the example to get started.
|
|
(Christian Moen, Masaru Hasegawa via Robert Muir)
|
|
|
|
* SOLR-1860: StopFilterFactory, CommonGramsFilterFactory, and
|
|
CommonGramsQueryFilterFactory can optionally read stopwords in Snowball
|
|
format (specify format="snowball"). (Robert Muir)
|
|
|
|
* SOLR-3105: ElisionFilterFactory optionally allows the parameter
|
|
ignoreCase (default=false). (Robert Muir)
|
|
|
|
* LUCENE-3714: Add WFSTLookupFactory, a suggester that uses a weighted FST
|
|
for more fine-grained suggestions. (Mike McCandless, Dawid Weiss, Robert Muir)
|
|
|
|
* SOLR-3143: Add SuggestQueryConverter, a QueryConverter intended for
|
|
auto-suggesters. (Robert Muir)
|
|
|
|
* SOLR-3033: ReplicationHandler's backup command now supports a 'maxNumberOfBackups'
|
|
init param that can be used to delete all but the most recent N backups. (Torsten Krah, James Dyer)
|
|
|
|
* SOLR-2202: Currency FieldType, with support for currencies and exchange rates
|
|
(Greg Fodor & Andrew Morrison via janhoy, rmuir, Uwe Schindler)
|
|
|
|
* SOLR-3026: eDismax: Locking down which fields can be explicitly queried (user fields aka uf)
|
|
(janhoy, hossman, Tomás Fernández Löbbe)
|
|
|
|
* SOLR-2826: URLClassify Update Processor (janhoy)
|
|
|
|
* SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer (janhoy)
|
|
|
|
* SOLR-3221: Added the ability to directly configure aspects of the concurrency
|
|
and thread-pooling used within distributed search in solr. This allows for finer
|
|
grained control and can be tuned by end users to target their own specific
|
|
requirements. This builds on the work of the HttpCommComponent and uses the same configuration
|
|
block to configure the thread pool. The default configuration has
|
|
the same behaviour as solr 3.5, favouring throughput over latency. More
|
|
information can be found on the wiki (http://wiki.apache.org/solr/SolrConfigXml) (Greg Bowyer)
|
|
|
|
* SOLR-2001: The query component will substitute an empty query that matches
|
|
no documents if the query parser returns null. This also prevents an
|
|
exception from being thrown by the default parser if "q" is missing. (yonik)
|
|
- SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)
|
|
|
|
* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
|
|
These can be used to customize range query/sort behavior, for example to
|
|
support numeric collation, ignore punctuation/whitespace, ignore accents but
|
|
not case, control whether upper/lowercase values are sorted first, etc. (rmuir)
|
|
|
|
* SOLR-2346: Add a chance to set content encoding explicitly via content type
|
|
of stream for extracting request handler. This is convenient when Tika's
|
|
auto detector cannot detect encoding, especially when the text file is too short
|
|
to detect encoding. (koji)
|
|
|
|
* SOLR-1499: Added SolrEntityProcessor that imports data from another Solr core
|
|
or instance based on a specified query.
|
|
(Lance Norskog, Erik Hatcher, Pulkit Singhal, Ahmet Arslan, Luca Cavanna,
|
|
Martijn van Groningen)
|
|
|
|
* SOLR-3190: Minor improvements to SolrEntityProcessor. Add more consistency
|
|
between solr parameters and parameters used in SolrEntityProcessor and
|
|
ability to specify a custom HttpClient instance.
|
|
(Luca Cavanna via Martijn van Groningen)
|
|
|
|
* SOLR-2382: Added pluggable cache support to DIH so that any Entity can be
|
|
made cache-able by adding the "cacheImpl" parameter. Include
|
|
"SortedMapBackedCache" to provide in-memory caching (as previously this was
|
|
the only option when using CachedSqlEntityProcessor). Users can provide
|
|
their own implementations of DIHCache for other caching strategies.
|
|
Deprecate CachedSqlEntityProcessor in favor of specifying "cacheImpl" with
|
|
SqlEntityProcessor. Make SolrWriter implement DIHWriter and allow the
|
|
possibility of pluggable Writers (DIH writing to something other than Solr).
|
|
(James Dyer, Noble Paul)
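
The NOW=<time_in_ms> parameter described for SOLR-1729 above can be used to
pin date math to one instant across several requests. A minimal sketch,
assuming the client assembles its request parameters as a dictionary; the
helper name with_fixed_now and the field name "timestamp" are hypothetical.

```python
import time


def with_fixed_now(params, now_ms=None):
    """Return a copy of `params` with NOW set to a fixed epoch-millisecond value."""
    fixed = dict(params)  # leave the caller's dictionary untouched
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    fixed["NOW"] = str(now_ms)
    return fixed


# Two requests that evaluate "NOW-1DAY" against the same instant.
base = {"q": "*:*", "fq": "timestamp:[NOW-1DAY TO NOW]"}
first = with_fixed_now(base, now_ms=1330000000000)
second = with_fixed_now({**base, "start": "10"}, now_ms=1330000000000)
```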
|
|
|
|
|
|
Optimizations
|
|
----------------------
|
|
* SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter
|
|
reportDocCount defaults to 'false'. Old behavior still possible by specifying this as 'true'
|
|
(Erick Erickson)
|
|
|
|
* SOLR-3012: Move System.getProperty("type") in postData() to main() and add type argument so that
|
|
the client applications of SimplePostTool can set content type via method argument. (koji)
|
|
|
|
* SOLR-2888: FSTSuggester refactoring: internal storage is now UTF-8,
|
|
external sorting (on disk) prevents OOMs even with large data sets
|
|
(the bottleneck is now FST construction), code cleanups and API cleanups.
|
|
(Dawid Weiss, Robert Muir)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
* SOLR-3187: SystemInfoHandler leaks filehandles (siren)
|
|
|
|
* LUCENE-3820: Fixed invalid position indexes by reimplementing PatternReplaceCharFilter.
|
|
This change also drops real support for boundary characters -- all input is prebuffered
|
|
for pattern matching. (Dawid Weiss)
|
|
|
|
* SOLR-3068: Fixed NPE in ThreadDumpHandler (siren)
|
|
|
|
* SOLR-2912: Fixed File descriptor leak in ShowFileRequestHandler (Michael Ryan, shalin)
|
|
|
|
* SOLR-2819: Improved speed of parsing hex entities in HTMLStripCharFilter
|
|
(Bernhard Berger, hossman)
|
|
|
|
* SOLR-2509: StringIndexOutOfBoundsException in the spellchecker collate when the term contains
|
|
a hyphen. (Thomas Gambier caught the bug, Steffen Godskesen did the patch, via Erick Erickson)
|
|
|
|
* SOLR-2955: Fixed IllegalStateException when querying with group.sort=score desc in sharded
|
|
environment. (Steffen Elberg Godskesen, Martijn van Groningen)
|
|
|
|
* SOLR-2956: Fixed inconsistencies in the flags (and flag key) reported by
|
|
the LukeRequestHandler (hossman)
|
|
|
|
* SOLR-1730: Made it clearer when a core failed to load as well as better logging when the
|
|
QueryElevationComponent fails to properly initialize (gsingers)
|
|
|
|
* SOLR-1520: QueryElevationComponent now supports non-string ids (gsingers)
|
|
|
|
* SOLR-3024: Fixed JSONTestUtil.matchObj, in previous releases it was not
|
|
respecting the 'delta' arg (David Smiley via hossman)
|
|
|
|
* SOLR-2542: Fixed DIH Context variables which were broken for all scopes other
|
|
than SCOPE_ENTITY (Linbin Chen & Frank Wesemann via hossman)
|
|
|
|
* SOLR-3042: Fixed Maven Jetty plugin configuration.
|
|
(David Smiley via Steve Rowe)
|
|
|
|
* SOLR-2970: CSV ResponseWriter returns fields defined as stored=false in schema (janhoy)
|
|
|
|
* LUCENE-3690, LUCENE-2208, SOLR-882, SOLR-42: Re-implemented
|
|
HTMLStripCharFilter as a JFlex-generated scanner and moved it to
|
|
lucene/contrib/analyzers/common/. See below for a list of bug fixes and
|
|
other changes. To get the same behavior as HTMLStripCharFilter in Solr
|
|
version 3.5 and earlier (including the bugs), use LegacyHTMLStripCharFilter,
|
|
which is the previous implementation.
|
|
|
|
Behavior changes from the previous version:
|
|
|
|
- Known offset bugs are fixed.
|
|
- The "Mark invalid" exceptions reported in SOLR-1283 are no longer
|
|
triggered (the bug is still present in LegacyHTMLStripCharFilter).
|
|
- The character entity "&apos;" is now always properly decoded.
|
|
- More cases of <script> tags are now properly stripped.
|
|
- CDATA sections are now handled properly.
|
|
- Valid tag name characters now include the supplementary Unicode characters
|
|
from Unicode character classes [:ID_Start:] and [:ID_Continue:].
|
|
- Uppercase character entities "&QUOT;", "&COPY;", "&GT;", "&LT;", "&REG;",
|
|
and "&" are now recognized and handled as if they were in lowercase.
|
|
- The REPLACEMENT CHARACTER U+FFFD is now used to replace numeric character
|
|
entities for unpaired UTF-16 low and high surrogates (in the range
|
|
[U+D800-U+DFFF]).
|
|
- Properly paired numeric character entities for UTF-16 surrogates are now
|
|
converted to the corresponding code units.
|
|
- Opening tags with unbalanced quotation marks are now properly stripped.
|
|
- Literal "<" and ">" characters in opening tags, regardless of whether they
|
|
appear inside quotation marks, now inhibit recognition (and stripping) of
|
|
the tags. The only exception to this is for values of event-handler
|
|
attributes, e.g. "onClick", "onLoad", "onSelect".
|
|
- A newline '\n' is substituted instead of a space for stripped HTML markup.
|
|
- Nothing is substituted for opening and closing inline tags - they are
|
|
simply removed. The list of inline tags is (case insensitively): <a>,
|
|
<abbr>, <acronym>, <b>, <basefont>, <bdo>, <big>, <cite>, <code>, <dfn>,
|
|
<em>, <font>, <i>, <img>, <input>, <kbd>, <label>, <q>, <s>, <samp>,
|
|
<select>, <small>, <span>, <strike>, <strong>, <sub>, <sup>, <textarea>,
|
|
<tt>, <u>, and <var>.
|
|
- HTMLStripCharFilterFactory now handles HTMLStripCharFilter's "escapedTags"
|
|
feature: opening and closing tags with the given names, including any
|
|
attributes and their values, are left intact in the output.
|
|
(Steve Rowe)
|
|
|
|
* LUCENE-3717: Fixed offset bugs in TrimFilter, WordDelimiterFilter, and
|
|
HyphenatedWordsFilter where they would create invalid offsets in
|
|
some situations, leading to problems in highlighting. (Robert Muir)
|
|
|
|
* SOLR-2280: commitWithin ignored for a delete query (Juan Grande via janhoy)
|
|
|
|
* SOLR-3073: Fixed 'Invalid UUID string' error when having an UUID field as
|
|
the unique key and executing a distributed grouping request. (Devon Krisman, Martijn van Groningen)
|
|
|
|
* SOLR-3084: Fixed initialization error when using
|
|
<queryResponseWriter default="true" ... /> (Bernd Fehling and hossman)
|
|
|
|
* SOLR-3109: Fixed numerous redundant shard requests when using distributed grouping.
|
|
(rblack via Martijn van Groningen)
|
|
|
|
* SOLR-3052: Fixed typo in distributed grouping parameters.
|
|
(Martijn van Groningen, Grant Ingersoll)
|
|
|
|
* SOLR-2909: Add support for ResourceLoaderAware tokenizerFactories in synonym
|
|
filter factories. (Tom Klonikowski, Jun Ohtani via Koji Sekiguchi)
|
|
|
|
* SOLR-3168: ReplicationHandler "numberToKeep" & "maxNumberOfBackups" parameters
|
|
would keep only 1 backup, even if more than 1 was specified (Neil Hooey, James Dyer)
|
|
|
|
* SOLR-3009: hitGrouped.vm isn't shipped with 3.x (ehatcher, janhoy)
|
|
|
|
* SOLR-3195: timeAllowed is ignored for grouping queries
|
|
(Russell Black via Martijn van Groningen)
|
|
|
|
* SOLR-2124: Do not log stack traces for "Service Disabled" / 503 Exceptions (PingRequestHandler, etc)
|
|
(James Dyer, others)
|
|
|
|
* SOLR-3260: DataImportHandler: ScriptTransformer gives better error messages when
|
|
problems arise on initialization (no Script Engine, invalid script, etc). (James Dyer)
|
|
|
|
* SOLR-2959: edismax now respects the magic fields '_val_' and '_query_'
|
|
(Michael Watts, hossman)
|
|
|
|
* SOLR-3074: fix SolrPluginUtils.docListToSolrDocumentList to respect the
|
|
list of fields specified. This fix also deprecates
|
|
DocumentBuilder.loadStoredFields which is not used anywhere in Solr,
|
|
and was fundamentally broken/bizarre.
|
|
(hossman, Ahmet Arslan)
|
|
|
|
* SOLR-2291: fix JSONWriter to respect field list when writing SolrDocuments
|
|
(Ahmet Arslan via hossman)
|
|
|
|
* SOLR-3264: Fix CoreContainer and SolrResourceLoader logging to be more
|
|
clear about when SolrCores are being created, and stop misleading people
|
|
about SolrCore instanceDir's being the "Solr Home Dir" (hossman)
|
|
|
|
* SOLR-3046: Fix whitespace typo in DIH response "Time taken" (hossman)
|
|
|
|
* SOLR-3261: Fix edismax to respect query operators when literal colons
|
|
are used in query string. (Juan Grande via hossman)
|
|
|
|
* SOLR-3226: Fix SignatureUpdateProcessor to no longer ignore non-String
|
|
field values (Spyros Kapnissis, hossman)
|
|
|
|
* SOLR-3200: Fix SignatureUpdateProcessor "all fields" mode to use all
|
|
fields of each document instead of the fields specified by the first
|
|
document indexed (Spyros Kapnissis via hossman)
|
|
|
|
* SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and
|
|
sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen)
|
|
|
|
* SOLR-3107: contrib/langid: When using the LangDetect implementation of
|
|
langid, set the random seed to 0, so that the same document is detected as
|
|
the same language with the same probability every time.
|
|
(Christian Moen via rmuir)
|
|
|
|
* SOLR-2937: Configuring the number of contextual snippets used for
|
|
search results clustering. The hl.snippets parameter is now respected
|
|
by the clustering plugin, can be overridden by carrot.summarySnippets
|
|
if needed (Stanislaw Osinski).
|
|
|
|
* SOLR-2938: Clustering on multiple fields. The carrot.title and
|
|
carrot.snippet can now take comma- or space-separated lists of
|
|
field names to cluster (Stanislaw Osinski).
|
|
|
|
* SOLR-2939: Clustering of multilingual search results. The document's
|
|
language field can be passed in the carrot.lang parameter, the carrot.lcmap
|
|
parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
|
|
|
|
* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component.
|
|
The custom field mappings are defined using the carrot.custom parameter
|
|
(Stanislaw Osinski).
|
|
|
|
* SOLR-2941: NullPointerException on clustering component initialization
|
|
when schema does not have a unique key field (Stanislaw Osinski).
|
|
|
|
* SOLR-2942: ClassCastException when passing non-textual fields to
|
|
clustering component (Stanislaw Osinski).
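
The surrogate handling listed under LUCENE-3690 above (unpaired surrogate
entities become U+FFFD, properly paired ones are combined) can be sketched
as below. This is a Python illustration of the described behavior, not the
JFlex scanner itself. Python strings hold code points, so a valid pair is
emitted as the single supplementary character it encodes. Mapping values
above U+10FFFF to U+FFFD is an assumption of this sketch.

```python
import re

REPLACEMENT = "\ufffd"
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_entities(text):
    """Decode &#NNN; and &#xHHH; entities, repairing surrogate usage."""
    out = []
    pos = 0
    pending_high = None  # high surrogate still waiting for its low partner
    for match in _NUMERIC_ENTITY.finditer(text):
        if match.start() > pos:
            # Literal text between entities breaks any pending pair.
            if pending_high is not None:
                out.append(REPLACEMENT)
                pending_high = None
            out.append(text[pos:match.start()])
        pos = match.end()
        cp = int(match.group(1), 16) if match.group(1) else int(match.group(2))
        if pending_high is not None:
            if 0xDC00 <= cp <= 0xDFFF:
                # Valid pair: combine into one supplementary code point.
                combined = 0x10000 + ((pending_high - 0xD800) << 10) + (cp - 0xDC00)
                out.append(chr(combined))
                pending_high = None
                continue
            out.append(REPLACEMENT)  # the high surrogate was unpaired
            pending_high = None
        if 0xD800 <= cp <= 0xDBFF:
            pending_high = cp
        elif 0xDC00 <= cp <= 0xDFFF or cp > 0x10FFFF:
            out.append(REPLACEMENT)  # lone low surrogate or out-of-range value
        else:
            out.append(chr(cp))
    if pending_high is not None:
        out.append(REPLACEMENT)
    out.append(text[pos:])
    return "".join(out)
```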
|
|
|
|
|
|
Other Changes
|
|
----------------------
|
|
* SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji)
|
|
|
|
* SOLR-2920: Refactor frequent conditional use of DefaultSolrParams and
|
|
AppendedSolrParams into factory methods.
|
|
(David Smiley via hossman)
|
|
|
|
* SOLR-3032: Deprecate logOnce from SolrException. logOnce and all the supporting
|
|
structure will disappear in 4.0. Errors should be caught and logged at the
|
|
top-most level or logged and NOT propagated up the chain. (Erick Erickson)
|
|
|
|
* SOLR-2718: Add ability to lazy load response writers, defined with startup="lazy".
|
|
(ehatcher)
|
|
|
|
* SOLR-2901: Upgrade Solr to Tika 1.0 (janhoy)
|
|
|
|
* SOLR-3059: Example XSL stylesheet for indexing query result XML (janhoy)
|
|
|
|
* SOLR-3097, SOLR-3105: Add analysis configurations for different languages to
|
|
the example. (Christian Moen, Robert Muir)
|
|
|
|
* SOLR-3005: Default QueryResponseWriters are now initialized via init() with an empty
|
|
NamedList. (Gasol Wu, Chris Male)
|
|
|
|
* SOLR-3140: Upgrade schema version to 1.5, where omitNorms defaults to "true" for all
|
|
primitive (non-analyzed) field types such as int, float, date, bool, string. (janhoy)
|
|
|
|
* SOLR-3077: Better error messages when attempting to use "blank" field names
|
|
(Antony Stubbs via hossman)
|
|
|
|
* SOLR-2712: expecting fl=score to return all fields is now deprecated.
|
|
In Solr 4.0, this will only return the score. (ryan)
|
|
|
|
* SOLR-3156: Check for Lucene directory locks at startup. In previous versions
|
|
this check was only performed when modifying (e.g. adding and deleting
|
|
documents) the index. (Luca Cavanna via Martijn van Groningen)
|
|
|
|
* SOLR-1052: Deprecated <indexDefaults> and <mainIndex> in solrconfig.xml.
|
|
From now on, all settings go in the new <indexConfig> tag, and some defaults are
|
|
changed: useCompoundFile=false, ramBufferSizeMB=32, lockType=native, so that
|
|
the effect of NOT specifying <indexConfig> at all gives same result as the
|
|
example config used to give in 3.5 (janhoy, gsingers)
|
|
|
|
* SOLR-3294: In contrib/clustering/lib/, replaced the manually retrowoven
|
|
Java 1.5-compatible carrot2-core-3.5.0.jar (which is not publicly available,
|
|
except from the Solr Subversion repository), with newly released Java
|
|
1.5-compatible carrot2-core-3.5.0.1.jar (hosted on the Maven Central
|
|
repository). Also updated dependencies jackson-core-asl and
|
|
jackson-mapper-asl (both v1.5.2 -> v1.7.4). (Dawid Weiss, Steve Rowe)
|
|
|
|
* SOLR-3295: netcdf jar is excluded from the binary release (and disabled in
|
|
ivy.xml) because it requires Java 6. If you want to parse this content with
|
|
extracting request handler and are willing to use Java 6, just add the jar.
|
|
(rmuir)
|
|
|
|
* SOLR-3142: DIH Imports no longer default optimize to true, instead false.
|
|
If you want to force all segments to be merged into one, you can specify
|
|
this parameter yourself. NOTE: this can be a very expensive operation and
|
|
usually does not make sense for delta-imports. (Robert Muir)
|
|
|
|
Build
|
|
----------------------
|
|
* SOLR-2487: Add build target to package war without slf4j jars (janhoy)
|
|
|
|
* SOLR-3112: Fix tests not to write to src/test-files (Luca Cavanna via Robert Muir)
|
|
|
|
* LUCENE-3753: Restructure the Solr build system. (Steve Rowe)
|
|
|
|
* SOLR-3204: The packaged pre-release artifact of Commons CSV used the original
|
|
package name (org.apache.commons.csv). This created a compatibility issue as
|
|
the Apache Commons team works toward an official release of Commons CSV.
|
|
The source of Commons CSV was added under a separate package name to the
|
|
Solr source code. (Uwe Schindler, Chris Male, Emmanuel Bourg)
|
|
|
|
* LUCENE-3930: Changed build system to use Apache Ivy for retrieval of 3rd
|
|
party JAR files. Please review README.txt for instructions.
|
|
(Robert Muir, Chris Male, Uwe Schindler, Steven Rowe, Hossman)
|
|
|
|
================== 3.5.0 ==================
|
|
|
|
New Features
|
|
----------------------
|
|
* SOLR-2749: Add boundary scanners for FastVectorHighlighter. <boundaryScanner/>
|
|
can be specified with a name in solrconfig.xml, and use hl.boundaryScanner=name
|
|
parameter to specify the named <boundaryScanner/>. (koji)
|
|
|
|
* SOLR-2066,SOLR-2776: Added support for distributed grouping.
|
|
(Martijn van Groningen, Jasper van Veghel, Matt Beaumont)
|
|
|
|
* SOLR-2769: Added factory for the new Hunspell stemmer capable of doing stemming
|
|
for 99 languages (janhoy, cmale)
|
|
|
|
* SOLR-1979: New contrib "langid". Adds language identification capabilities as an
|
|
Update Processor, using Tika's LanguageIdentifier or Cybozu language-detection
|
|
library (janhoy, Tommaso Teofili, gsingers)
|
|
|
|
* SOLR-2818: Added before/after count response parsing support for range facets in
|
|
SolrJ. (Bernhard Frauendienst via Martijn van Groningen)
|
|
|
|
* SOLR-2276: Add support for cologne phonetic to PhoneticFilterFactory.
|
|
(Marc Pompl via rmuir)
|
|
|
|
* SOLR-1926: Add hl.q parameter. (koji)
|
|
|
|
* SOLR-2881: Numeric types now support sortMissingFirst/Last. This includes Trie and date types
|
|
(Ryan McKinley, Mike McCandless, Uwe Schindler, Erick Erickson)
|
|
|
|
* SOLR-1023: StatsComponent now supports date fields and string fields.
|
|
(Chris Male, Mark Holland, Gunnlaugur Thor Briem, Ryan McKinley)
|
|
|
|
* SOLR-2578: ReplicationHandler's backup command now supports a 'numberToKeep'
|
|
request param that can be used to delete all but the most recent N backups.
|
|
(James Dyer via hossman)
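
The retention rule in the SOLR-2578 entry above (delete all but the most recent N backups) can be modelled with a short Python sketch. This is only an illustration, not ReplicationHandler code: the function name and the snapshot names are hypothetical, and it assumes backup names sort chronologically.

```python
def backups_to_delete(backup_names, number_to_keep):
    """Return the backups to delete so that only the newest N remain.

    Assumption: names sort chronologically (e.g. a timestamp suffix).
    """
    if number_to_keep < 0:
        raise ValueError("number_to_keep must be non-negative")
    ordered = sorted(backup_names)  # oldest first
    keep_from = max(len(ordered) - number_to_keep, 0)
    return ordered[:keep_from]


# Hypothetical snapshot names, listed out of order on purpose.
names = ["snapshot.20111003", "snapshot.20111001", "snapshot.20111002"]
stale = backups_to_delete(names, 2)
```

With the three names above and a value of 2, only the oldest snapshot is selected for deletion.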
|
|
|
|
* SOLR-2839: Add alternative implementation to contrib/langid supporting 53
|
|
languages, based on http://code.google.com/p/language-detection/ (rmuir)
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* SOLR-2742: SolrJ: Provide commitWithinMs as optional parameter for all add() methods,
|
|
making the feature more conveniently accessible for developers (janhoy)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
* SOLR-2748: The CommitTracker used for commitWithin or autoCommit by maxTime
|
|
could commit too frequently and could block adds until a new searcher was
|
|
registered. (yonik)
|
|
|
|
* SOLR-2726: Fixed NullPointerException when using spellcheck.q with Suggester.
|
|
(Bernd Fehling, valentin via rmuir)
|
|
|
|
* SOLR-2772: Fixed Date parsing/formatting of years 0001-1000 (hossman)
|
|
|
|
* SOLR-2763: Extracting update request handler throws exception and returns 400
|
|
when zero-length file posted using multipart form post (janhoy)
|
|
|
|
* SOLR-2780: Fixed issue where multi select facets didn't respect group.truncate parameter.
|
|
(Martijn van Groningen, Ramzi Alqrainy)
|
|
|
|
* SOLR-2793: In rare cases (most likely during shutdown), a SolrIndexSearcher can be left
|
|
open if the executor rejects a task. (Mark Miller)
|
|
|
|
* SOLR-2791: Replication: abortfetch command is broken if replication was started
|
|
by fetchindex command instead of a regular poll (Yury Kats via shalin)
|
|
|
|
* SOLR-2861: Fix extremely rare race condition on commit that can result
|
|
in a NPE (yonik)
|
|
|
|
* SOLR-2813: Fix HTTP error codes returned when requests contain strings that
|
|
can not be parsed as numbers for Trie fields. (Jeff Crump and hossman)
|
|
|
|
* SOLR-2902: List of collations was parsed incorrectly in SpellCheckResponse, causing
|
|
a wrong number of collation results in the response.
|
|
(Bastiaan Verhoef, James Dyer via Simon Willnauer)
|
|
|
|
* SOLR-2875: Fix the incorrect url in DIH example tika-data-config.xml
|
|
(Shinichiro Abe via koji)
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-2750: Make both "update.chain" and the deprecated "update.param" work
|
|
consistently everywhere; see also SOLR-2105. (Mark Miller, janhoy)
|
|
|
|
* LUCENE-3410: Deprecated the WordDelimiterFilter constructors accepting multiple
|
|
ints masquerading as booleans. Preferred constructor now accepts a single int
|
|
bitfield (Chris Male)
|
|
|
|
* SOLR-2758: Moved ConcurrentLRUCache from o.a.s.common.util package in the solrj
|
|
module to the o.a.s.util package in the Solr core module.
|
|
(David Smiley via Steve Rowe)
|
|
|
|
* SOLR-2766: Package individual javadoc sites for solrj and test-framework.
|
|
(Steve Rowe, Mike McCandless)
|
|
|
|
* SOLR-2771: Solr modules' tests should not depend on solr-core test classes;
|
|
move BufferingRequestProcessor from solr-core tests to test-framework so that
|
|
the Solr Cell module can use it. (janhoy, Steve Rowe)
|
|
|
|
* LUCENE-3457: Upgrade commons-compress to 1.2 (Doron Cohen)
|
|
|
|
* SOLR-2757: min() and max() functions now support an arbitrary number of
|
|
ValueSources (Bill Bell via hossman)
|
|
|
|
* SOLR-2372: Upgrade Solr to Tika 0.10 (janhoy)
|
|
|
|
* SOLR-2792: Allow case insensitive Hunspell stemming (janhoy, rmuir)
|
|
|
|
* SOLR-2862: More explicit lexical resources location logged if Carrot2 clustering
|
|
extension is used. Fixed solr. impl. of IResource and IResourceLookup. (Dawid Weiss)
|
|
|
|
* SOLR-2849: Fix dependencies in Maven POMs. (David Smiley via Steve Rowe)
|
|
|
|
* SOLR-2591: Remove commitLockTimeout option from solrconfig.xml (Luca Cavanna via Martijn van Groningen)
|
|
|
|
* SOLR-2746: Upgraded UIMA dependencies from *-2.3.1-SNAPSHOT.jar to *-2.3.1.jar.
|
|
|
|
|
|
================== 3.4.0 ==================
|
|
|
|
Upgrading from Solr 3.3
|
|
----------------------
|
|
|
|
* The Lucene index format has changed and as a result, once you upgrade,
|
|
previous versions of Solr will no longer be able to read your indices.
|
|
In a master/slave configuration, all searchers/slaves should be upgraded
|
|
before the master. If the master were to be updated first, the older
|
|
searchers would not be able to read the new index format.
|
|
|
|
* Previous versions of Solr silently allow and ignore some contradictory
|
|
properties specified in schema.xml. For example:
|
|
- indexed="false" omitNorms="false"
|
|
- indexed="false" omitTermFreqAndPositions="false"
|
|
Field property validation has now been fixed, to ensure that
|
|
contradictions like these now generate error messages. If users
|
|
have existing schemas that generate one of these new "conflicting
|
|
'false' field options for non-indexed field" error messages the
|
|
conflicting "omit*" properties can safely be removed, or changed to
|
|
"true" for consistent behavior with previous Solr versions. This
|
|
situation has now been fixed to cause an error on startup when these
|
|
contradictory options are specified. See SOLR-2669.
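
The kind of check described in the note above can be sketched in Python as follows. This is a simplified model that covers only the two examples listed in the note; the function name is hypothetical and the real validation in Solr may cover more properties.

```python
# The two omit* properties named in the upgrade note above.
OMIT_PROPERTIES = ("omitNorms", "omitTermFreqAndPositions")


def conflicting_false_options(props):
    """Return omit* properties explicitly set to "false" on a non-indexed field.

    `props` maps schema.xml attribute names to their string values.
    """
    if props.get("indexed") != "false":
        return []  # an indexed field may set omit* either way
    return [name for name in OMIT_PROPERTIES if props.get(name) == "false"]


bad = conflicting_false_options({"indexed": "false", "omitNorms": "false"})
ok = conflicting_false_options({"indexed": "false", "omitNorms": "true"})
```

A non-empty result corresponds to a field definition that would now be rejected; removing the property or setting it to "true" clears the conflict, as the note says.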
|
|
|
|
* FacetComponent no longer catches and embeds exceptions that occurred during facet
|
|
processing; it throws HTTP 400 or 500 exceptions instead.
|
|
|
|
New Features
|
|
----------------------
|
|
|
|
* SOLR-2540: CommitWithin as an Update Request parameter
|
|
You can now specify &commitWithin=N (ms) on the update request (janhoy)
|
|
|
|
* SOLR-2458: post.jar enhanced to handle JSON, CSV and <optimize> (janhoy)
|
|
|
|
* LUCENE-3234: add a new parameter hl.phraseLimit for FastVectorHighlighter speed up.
|
|
(Mike Sokolov via koji)
|
|
|
|
* SOLR-2429: Ability to add cache=false to queries and query filters to avoid
|
|
using the filterCache or queryCache. A cost may also be specified and is used
|
|
to order the evaluation of non-cached filters from least to greatest cost.
|
|
For very expensive query filters (cost >= 100) if the query implements
|
|
the PostFilter interface, it will be used to obtain a Collector that is
|
|
checked only for documents that match the main query and all other filters.
|
|
The "frange" query now implements the PostFilter interface. (yonik)
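
The ordering rules in the SOLR-2429 entry above can be modelled with a small Python sketch. It is not Solr's implementation: the function name, the dict layout, and the example filter strings (including their field names) are hypothetical, and the `post_filter` flag simply stands in for "the query implements the PostFilter interface".

```python
POST_FILTER_MIN_COST = 100  # threshold quoted in the entry above


def plan_filters(filters):
    """Split filters into cached, ordered non-cached, and post filters.

    Each filter is a dict with keys: fq (str), cache (bool), cost (int),
    post_filter (bool).
    """
    cached = [f for f in filters if f["cache"]]
    # Non-cached filters are evaluated from least to greatest cost.
    uncached = sorted((f for f in filters if not f["cache"]),
                      key=lambda f: f["cost"])
    # Expensive non-cached filters that support post filtering run last,
    # only on documents that matched the main query and all other filters.
    post = [f for f in uncached
            if f["cost"] >= POST_FILTER_MIN_COST and f["post_filter"]]
    regular = [f for f in uncached if f not in post]
    return cached, regular, post


filters = [
    {"fq": "{!frange l=0 u=10 cache=false cost=200}sum(a,b)",
     "cache": False, "cost": 200, "post_filter": True},
    {"fq": "{!cache=false cost=50}title:solr",
     "cache": False, "cost": 50, "post_filter": False},
    {"fq": "{!cache=false cost=10}inStock:true",
     "cache": False, "cost": 10, "post_filter": False},
    {"fq": "type:book", "cache": True, "cost": 0, "post_filter": False},
]
cached, regular, post = plan_filters(filters)
```

In this example the two cheap non-cached filters are ordered by cost, and the "frange" filter with cost 200 is set aside as a post filter.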
|
|
|
|
* SOLR-2630: Added new XsltUpdateRequestHandler that works like
|
|
XmlUpdateRequestHandler but can transform the POSTed XML document
|
|
using XSLT. This makes it possible to POST arbitrary XML documents to the update
|
|
handler, as long as you also provide an XSL to transform them to a valid
|
|
Solr input document. (Upayavira, Uwe Schindler)
|
|
|
|
* SOLR-2615: Log individual updates (adds and deletes) at the FINE level
|
|
before adding to the index. Fix a null pointer exception in logging
|
|
when there was no unique key. (David Smiley via yonik)
|
|
|
|
* LUCENE-2048: Added omitPositions to the schema, so you can omit position
|
|
information while still indexing term frequencies. (rmuir)
|
|
|
|
* SOLR-2584: add UniqFieldsUpdateProcessor that removes duplicate values in the
|
|
specified fields. (Elmer Garduno, koji)
|
|
|
|
* SOLR-2670: Added NIOFSDirectoryFactory (yonik)
|
|
|
|
* SOLR-2523: Added support in SolrJ to easily interact with range facets.
|
|
The range facet response can be parsed and is retrievable from the
|
|
QueryResponse class. The SolrQuery class has convenient methods for using
|
|
range facets. (Martijn van Groningen)
|
|
|
|
* SOLR-2637: Added support for group result parsing in SolrJ.
|
|
(Tao Cheng, Martijn van Groningen)
|
|
|
|
* SOLR-2665: Added post group faceting. Facet counts are based on the most
|
|
relevant document of each group matching the query. This feature has the
|
|
same impact on the StatsComponent. (Martijn van Groningen)
|
|
|
|
* SOLR-2675: CoreAdminHandler now allows arbitrary properties to be
|
|
specified when CREATEing a new SolrCore using property.* request
|
|
params. (Yury Kats, hossman)
|
|
|
|
* SOLR-2714: JSON update format - "null" field values are now dropped
|
|
instead of causing an exception. (Trygve Laugstøl, yonik)
|
|
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* LUCENE-3233: Improved memory usage, build time, and performance of
|
|
SynonymFilterFactory. (Mike McCandless, Robert Muir)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
|
|
* SOLR-2625: TermVectorComponent throws NPE if TF-IDF option is used without DF
|
|
option. (Daniel Erenrich, Simon Willnauer)
|
|
|
|
* SOLR-2631: PingRequestHandler should not allow to ping itself using "qt"
|
|
param to prevent infinite loop. (Edoardo Tosca, Uwe Schindler)
|
|
|
|
* SOLR-2636: Fix explain functionality for negative queries. (Tom Hill via yonik)
|
|
|
|
* SOLR-2538: Range Faceting on long/double fields could overflow if values
|
|
bigger than the max int/float were used.
|
|
(Erbi Hanka, hossman)
|
|
|
|
* SOLR-2230: CommonsHttpSolrServer.addFile could not be used to send
|
|
multiple files in a single request.
|
|
(Stephan Günther, hossman)
|
|
|
|
* SOLR-2541: PluginInfos was not correctly parsing <long/> tags when
|
|
initializing plugins
|
|
(Frank Wesemann, hossman)
|
|
|
|
* SOLR-2623: Solr JMX MBeans do not survive core reloads (Alexey Serba, shalin)
|
|
|
|
* Fixed grouping bug where, when start is bigger than rows and format is simple, zero documents are returned even
|
|
if there are documents to display. (Martijn van Groningen, Nikhil Chhaochharia)
|
|
|
|
* SOLR-2564: Fixed ArrayIndexOutOfBoundsException when using simple format and
|
|
start > 0 (Martijn van Groningen, Matteo Melli)
|
|
|
|
* SOLR-2642: Fixed sorting by function when using grouping. (Thomas Heigl, Martijn van Groningen)
|
|
|
|
* SOLR-2535: REGRESSION: in Solr 3.x and trunk the admin/file handler
|
|
fails to show directory listings (David Smiley, Peter Wolanin via Erick Erickson)
|
|
|
|
* SOLR-2545: ExternalFileField file parsing would fail if any key
|
|
contained an "=" character. It now only looks for the last "=" delimiter
|
|
prior to the float value.
|
|
(Markus Jelsma, hossman)
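
The parsing rule in the SOLR-2545 entry above (split on the last "=" before the float value) can be illustrated with a minimal Python sketch. The function name is hypothetical and this is not the ExternalFileField code itself.

```python
def parse_external_line(line):
    """Split 'key=value' on the last '=' so that keys may contain '='."""
    key, sep, value = line.rstrip("\n").rpartition("=")
    if not sep:
        raise ValueError(f"no '=' delimiter in line: {line!r}")
    return key, float(value)


simple = parse_external_line("doc1=0.25\n")
tricky = parse_external_line("a=b=1.5")  # the key itself contains '='
```

Splitting on the first "=" instead would have produced the key "a" and the unparseable value "b=1.5" for the second line.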
|
|
|
|
* SOLR-2662: When Solr is configured to have no queryResultCache, the
|
|
"start" parameter was not honored and the documents returned were
|
|
0 through start+offset. (Markus Jelsma, yonik)
|
|
|
|
* SOLR-2669: Fix backwards validation of field properties in
|
|
SchemaField.calcProps (hossman)
|
|
|
|
* SOLR-2676: Add "welcome-file-list" to solr.war so admin UI works correctly
|
|
in servlet containers such as WebSphere that do not use a default list
|
|
(Jay R. Jaeger, hossman)
|
|
|
|
* SOLR-2606: Fixed sort parsing of fields containing punctuation that
|
|
failed due to sort by function changes introduced in SOLR-1297
|
|
(Mitsu Hadeishi, hossman)
|
|
|
|
* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter
|
|
now works with absolute directories (Stanislaw Osinski)
|
|
|
|
* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise"
|
|
changed to "carrot.fragSize" (Stanislaw Osinski).
|
|
|
|
* SOLR-2644: When using DIH with threads=2 the default logging is set too high
|
|
(Bill Bell via shalin)
|
|
|
|
* SOLR-2492: DIH does not commit if only deletes are processed
|
|
(James Dyer via shalin)
|
|
|
|
* SOLR-2186: DataImportHandler's multi-threaded option throws NPE
|
|
(Lance Norskog, Frank Wesemann, shalin)
|
|
|
|
* SOLR-2655: DIH multi threaded mode does not resolve attributes correctly
|
|
(Frank Wesemann, shalin)
|
|
|
|
* SOLR-2695: DIH: Documents are collected in unsynchronized list in
|
|
multi-threaded debug mode (Michael McCandless, shalin)
|
|
|
|
* SOLR-2668: DIH multithreaded mode does not rollback on errors from
|
|
EntityProcessor (Frank Wesemann, shalin)
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-2629: Eliminate deprecation warnings in some JSPs.
|
|
(Bernd Fehling, hossman)
|
|
|
|
* SOLR-2743: Remove commons logging from contrib/extraction. (koji)
|
|
|
|
|
|
Build
|
|
----------------------
|
|
|
|
* SOLR-2452,SOLR-2653,LUCENE-3323,SOLR-2659,LUCENE-3329,SOLR-2666:
|
|
Rewrote the Solr build system:
|
|
- Integrated more fully with the Lucene build system: generalized the
|
|
Lucene build system and eliminated duplication.
|
|
- Converted all Solr contribs to the Lucene/Solr conventional src/ layout:
|
|
java/, resources/, test/, and test-files/<contrib-name>.
|
|
- Created a new Solr-internal module named "core" by moving the java/,
|
|
test/, and test-files/ directories from solr/src/ to solr/core/src/.
|
|
- Merged solr/src/webapp/src/ into solr/core/src/java/.
|
|
- Eliminated solr/src/ by moving all its directories up one level;
|
|
renamed solr/src/site/ to solr/site-src/ because solr/site/ already
|
|
exists.
|
|
- Merged solr/src/common/ into solr/solrj/src/java/.
|
|
- Moved o.a.s.client.solrj.* and o.a.s.common.* tests from
|
|
solr/src/test/ to solr/solrj/src/test/.
|
|
- Made the solrj tests not depend on the solr core tests by moving
|
|
some classes from solr/src/test/ to solr/test-framework/src/java/.
|
|
- Each internal module (core/, solrj/, test-framework/, and webapp/)
|
|
now has its own build.xml, from which it is possible to run
|
|
module-specific targets. solr/build.xml delegates all build
|
|
tasks (via <ant dir="internal-module-dir"> calls) to these
|
|
modules' build.xml files.
|
|
(Steve Rowe, Robert Muir)
|
|
|
|
* LUCENE-3406: Add ant target 'package-local-src-tgz' to Lucene and Solr
|
|
to package sources from the local working copy.
|
|
(Seung-Yeoul Yang via Steve Rowe)
|
|
|
|
Documentation
|
|
----------------------
|
|
|
|
================== 3.3.0 ==================
|
|
|
|
Upgrading from Solr 3.2.0
|
|
----------------------
|
|
* SolrCore's CloseHook API has been changed in a backward-incompatible way. It
|
|
has been changed from an interface to an abstract class. Any custom
|
|
components which use the SolrCore.addCloseHook method will need to
|
|
be modified accordingly. To migrate, put your old CloseHook#close impl into
|
|
CloseHook#preClose.
|
|
|
|
New Features
|
|
----------------------
|
|
|
|
* SOLR-2378: A new, automaton-based, implementation of suggest (autocomplete)
|
|
component, offering an order of magnitude smaller memory consumption
|
|
compared to ternary trees and jaspell and very fast lookups at runtime.
|
|
(Dawid Weiss)
|
|
|
|
* SOLR-2400: Field- and DocumentAnalysisRequestHandler now provide a position
|
|
history for each token, so you can follow the token through all analysis stages.
|
|
The output contains a separate int[] attribute containing all positions from
|
|
previous Tokenizers/TokenFilters (called "positionHistory").
|
|
(Uwe Schindler)
|
|
|
|
* SOLR-2524: (SOLR-236, SOLR-237, SOLR-1773, SOLR-1311) Grouping / Field collapsing
|
|
using the Lucene grouping contrib. The search result can be grouped by field and query.
|
|
(Martijn van Groningen, Emmanuel Keller, Shalin Shekhar Mangar, Koji Sekiguchi,
|
|
Iván de Prado, Ryan McKinley, Marc Sturlese, Peter Karich, Bojan Smid,
|
|
Charles Hornberger, Dieter Grad, Dmitry Lihachev, Doug Steigerwald,
|
|
Karsten Sperling, Michael Gundlach, Oleg Gnatovskiy, Thomas Traeger,
|
|
Harish Agarwal, yonik, Michael McCandless, Bill Bell)
|
|
|
|
* SOLR-1331 -- Added a srcCore parameter to CoreAdminHandler's mergeindexes action
|
|
to merge one or more cores' indexes to a target core (shalin)
|
|
|
|
* SOLR-2610 -- Add an option to delete index through CoreAdmin UNLOAD action (shalin)
|
|
|
|
* SOLR-2480: Add ignoreTikaException flag to the extraction request handler so
|
|
that users can ignore TikaException but index meta data.
|
|
(Shinichiro Abe, koji)
|
|
|
|
* SOLR-2582: Use uniqueKey for error log in UIMAUpdateRequestProcessor.
|
|
(Tommaso Teofili via koji)
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* SOLR-2567: Solr now defaults to TieredMergePolicy. See http://s.apache.org/merging
|
|
for more information. (rmuir)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
|
|
* SOLR-2519: Improve text_* fieldTypes in example schema.xml: improve
|
|
cross-language defaults for text_general; break out separate
|
|
English-specific fieldTypes (Jan Høydahl, hossman, Robert Muir,
|
|
yonik, Mike McCandless)
|
|
|
|
* SOLR-2462: Fix extremely high memory usage problems with spellcheck.collate.
|
|
Separately, an additional spellcheck.maxCollationEvaluations (default=10000)
|
|
parameter is added to avoid excessive CPU time in extreme cases (e.g. long
|
|
queries with many misspelled words). (James Dyer via rmuir)
|
|
|
|
* SOLR-2579: UIMAUpdateRequestProcessor ignore error fails if text.length() < 100.
|
|
(Elmer Garduno via koji)
|
|
|
|
* SOLR-2581: UIMAToSolrMapper wrongly instantiates Type with reflection.
|
|
(Tommaso Teofili via koji)
|
|
|
|
* SOLR-2551: Check dataimport.properties for write access (if delta-import is
|
|
supported in DIH configuration) before starting an import (C S, shalin)
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-2571: Add a commented out example of the spellchecker's thresholdTokenFrequency
|
|
parameter to the example solrconfig.xml, and also add a unit test for this feature.
|
|
(James Dyer via rmuir)
|
|
|
|
* SOLR-2576: Deprecate SpellingResult.add(Token token, int docFreq), please use
|
|
SpellingResult.addFrequency(Token token, int docFreq) instead.
|
|
(James Dyer via rmuir)
|
|
|
|
* SOLR-2574: Upgrade slf4j to v1.6.1 (shalin)
|
|
|
|
* LUCENE-3204: The maven-ant-tasks jar is now included in the source tree;
|
|
users of the generate-maven-artifacts target no longer have to manually
|
|
place this jar in the Ant classpath. NOTE: when Ant looks for the
|
|
maven-ant-tasks jar, it looks first in its pre-existing classpath, so
|
|
any copies it finds will be used instead of the copy included in the
|
|
Lucene/Solr source tree. For this reason, it is recommended to remove
|
|
any copies of the maven-ant-tasks jar in the Ant classpath, e.g. under
|
|
~/.ant/lib/ or under the Ant installation's lib/ directory. (Steve Rowe)
|
|
|
|
* SOLR-2611: Fix typos in the example configuration (Eric Pugh via rmuir)
|
|
|
|
================== 3.2.0 ==================
|
|
Versions of Major Components
|
|
---------------------
|
|
Apache Lucene trunk
|
|
Apache Tika 0.8
|
|
Carrot2 3.4.2
|
|
|
|
|
|
Upgrading from Solr 3.1
|
|
----------------------
|
|
|
|
* The updateRequestProcessorChain for a RequestHandler is now defined
|
|
with update.chain rather than update.processor. The latter still works,
|
|
but has been deprecated.
|
|
|
|
* <uimaConfig/> just beneath <config> ... </config> is no longer supported.
|
|
It should move to UIMAUpdateRequestProcessorFactory setting.
|
|
See contrib/uima/README.txt for more details. (SOLR-2436)
|
|
|
|
Detailed Change List
|
|
----------------------
|
|
|
|
New Features
|
|
----------------------
|
|
|
|
* SOLR-2496: Add ability to specify overwrite and commitWithin as request
|
|
parameters (e.g. specified in the URL) when using the JSON update format,
|
|
and added a simplified format for specifying multiple documents.
|
|
Example: [{"id":"doc1"},{"id":"doc2"}]
|
|
(yonik)
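
As a rough sketch of the SOLR-2496 entry above, the following Python builds an update URL that carries commitWithin and overwrite as request parameters, plus a body in the simplified multi-document format. The host, port, handler path, and function name are assumptions made for illustration; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def build_json_update(base_url, docs, commit_within_ms=None, overwrite=None):
    """Return (url, body) for a JSON update using the simplified list format."""
    params = {}
    if commit_within_ms is not None:
        params["commitWithin"] = str(commit_within_ms)
    if overwrite is not None:
        params["overwrite"] = "true" if overwrite else "false"
    url = base_url + ("?" + urlencode(params) if params else "")
    # Compact separators reproduce the example shown in the entry above.
    body = json.dumps(docs, separators=(",", ":"))
    return url, body


url, body = build_json_update(
    "http://localhost:8983/solr/update/json",  # assumed handler location
    [{"id": "doc1"}, {"id": "doc2"}],
    commit_within_ms=5000,
    overwrite=False,
)
```

The resulting body is the same list-of-documents form given as the example in the entry.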
|
|
|
|
* SOLR-2113: Add TermQParserPlugin, registered as "term". This is useful
|
|
when generating filter queries from terms returned from field faceting or
|
|
the terms component. Example: fq={!term f=weight}1.5 (hossman, yonik)
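
A filter query like the one in the SOLR-2113 entry above can be generated from a facet term with a few lines of Python. The helper name is hypothetical; the sketch only assembles and URL-encodes the parameter string.

```python
from urllib.parse import quote


def term_filter(field, term):
    """Build an fq value that matches one raw term via the "term" parser."""
    return "{!term f=" + field + "}" + term


fq = term_filter("weight", "1.5")
# Percent-encode the whole value before placing it in a URL.
encoded = "fq=" + quote(fq, safe="")
```

Because the term is taken as-is, the value returned by faceting or the terms component does not need query-syntax escaping, only URL encoding.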
|
|
|
|
* SOLR-1915: DebugComponent now supports using a NamedList to model
|
|
Explanation objects in its responses instead of
|
|
Explanation.toString (hossman)
|
|
|
|
* SOLR-2448: Search results clustering updates: bisecting k-means
|
|
clustering algorithm added, loading of Carrot2 stop words from
|
|
<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
|
|
for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
|
|
(Stanislaw Osinski, Dawid Weiss).
|
|
|
|
* SOLR-2503: extend UIMAUpdateRequestProcessorFactory mapping function to
|
|
map feature value to dynamicField. (koji)
|
|
|
|
* SOLR-2512: add ignoreErrors flag to UIMAUpdateRequestProcessorFactory so
|
|
that users can ignore exceptions in AE. (Tommaso Teofili, koji)
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
|
|
* SOLR-2445: Change the default qt to blank in form.jsp, because there is no "standard"
|
|
request handler unless you have it in your solrconfig.xml explicitly. (koji)
|
|
|
|
* SOLR-2455: Prevent double submit of forms in admin interface.
|
|
(Jeffrey Chang via uschindler)
|
|
|
|
* SOLR-2464: Fix potential slowness in QueryValueSource (the query() function) when
|
|
the query is very sparse and may not match any documents in a segment. (yonik)
|
|
|
|
* SOLR-2469: When using java replication with replicateAfter=startup, the first
|
|
commit point on server startup is never removed. (yonik)
|
|
|
|
* SOLR-2466: SolrJ's CommonsHttpSolrServer would retry requests on failure, regardless
|
|
of the configured maxRetries, due to HttpClient having its own retry mechanism
|
|
by default. The retryCount of HttpClient is now set to 0, and SolrJ does
|
|
the retry. (yonik)
|
|
|
|
* SOLR-2409: edismax parser - treat the text of a fielded query as a literal if the
|
|
fieldname does not exist. For example Mission: Impossible should not search on
|
|
the "Mission" field unless it's a valid field in the schema. (Ryan McKinley, yonik)
|
|
|
|
* SOLR-2403: facet.sort=index reported incorrect results for distributed search
|
|
in a number of scenarios when facet.mincount>0. This patch also adds some
|
|
performance/algorithmic improvements when (facet.sort=count && facet.mincount=1
|
|
&& facet.limit=-1) and when (facet.sort=index && facet.mincount>0) (yonik)
|
|
|
|
* SOLR-2333: The "rename" core admin action does not persist the new name to solr.xml
|
|
(Rasmus Hahn, Paul R. Brown via Mark Miller)
|
|
|
|
* SOLR-2390: Performance of usePhraseHighlighter is terrible on very large Documents,
|
|
regardless of hl.maxDocCharsToAnalyze. (Mark Miller)
|
|
|
|
* SOLR-2474: The helper TokenStreams in analysis.jsp and AnalysisRequestHandlerBase
|
|
did not clear all attributes so they displayed incorrect attribute values for tokens
|
|
in later filter stages. (uschindler, rmuir, yonik)
|
|
|
|
* SOLR-2467: Fix <analyzer class="..." /> initialization so any errors
|
|
are logged properly. (hossman)
|
|
|
|
* SOLR-2493: SolrQueryParser was fixed to not parse the SolrConfig DOM tree on each
|
|
instantiation which is a huge slowdown. (Stephane Bailliez via uschindler)
|
|
|
|
* SOLR-2495: The JSON parser could hang on corrupted input and could fail
|
|
to detect numbers that were too large to fit in a long. (yonik)
|
|
|
|
* SOLR-2520: Make JSON response format escape \u2029 as well as \u2028
|
|
in strings since those characters are not valid in javascript strings
|
|
(although they are valid in JSON strings). (yonik)
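
The escaping described in the SOLR-2520 entry above can be sketched in Python. This is an illustration of the idea rather than Solr's response writer; the function name is hypothetical.

```python
import json


def to_js_safe_json(value):
    """Serialize to JSON, escaping U+2028 and U+2029 in the output text."""
    # ensure_ascii=False leaves non-ASCII characters unescaped, so the two
    # separator characters have to be escaped explicitly afterwards.
    text = json.dumps(value, ensure_ascii=False)
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")


out = to_js_safe_json("a\u2028b\u2029c")
```

The escaped text still parses back to the original string with any JSON parser, so clients that read the response as JSON see no difference.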
|
|
|
|
* SOLR-2536: Add ReloadCacheRequestHandler to fix ExternalFileField bug (if reopenReaders
|
|
set to true and no index segments have been changed, commit cannot trigger a reload of the
|
|
external file). (koji)
|
|
|
|
* SOLR-2539: VectorValueSource.floatVal incorrectly used byteVal on sub-sources.
|
|
(Tom Liu via yonik)
|
|
|
|
* SOLR-2554: RandomSortField didn't work when used in a function query. (yonik)
|
|
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-2061: Pull base tests out into a new Solr Test Framework module,
|
|
and publish binary, javadoc, and source test-framework jars.
|
|
(Drew Farris, Robert Muir, Steve Rowe)
|
|
|
|
* SOLR-2105: Rename RequestHandler param 'update.processor' to 'update.chain'.
|
|
(Jan Høydahl via Mark Miller)
|
|
|
|
* SOLR-2485: Deprecate BaseResponseWriter, GenericBinaryResponseWriter, and
|
|
GenericTextResponseWriter. These classes will be removed in 4.0. (ryan)
|
|
|
|
* SOLR-2451: Enhance assertJQ to allow individual tests to specify the
|
|
tolerance delta used in numeric equalities. This allows for slight
|
|
variance in asserting score comparisons in unit tests.
|
|
(David Smiley, Chris Hostetter)
|
|
|
|
* SOLR-2528: Remove default="true" from HtmlEncoder in example solrconfig.xml,
|
|
because html encoding confuses non-ascii users. (koji)
|
|
|
|
* SOLR-2387: add mock annotators for improved testing in contrib/uima.
|
|
(Tommaso Teofili via rmuir)
|
|
|
|
* SOLR-2436: move uimaConfig to under the uima's update processor in
|
|
solrconfig.xml. (Tommaso Teofili, koji)
|
|
|
|
Build
|
|
----------------------
|
|
|
|
* LUCENE-3006: Building javadocs will fail on warnings by default. Override with -Dfailonjavadocwarning=false (sarowe, gsingers)
|
|
|
|
|
|
Documentation
|
|
----------------------
|
|
|
|
|
|
================== 3.1.0 ==================
|
|
Versions of Major Components
|
|
---------------------
|
|
Apache Lucene 3.1.0
|
|
Apache Tika 0.8
|
|
Carrot2 3.4.2
|
|
Velocity 1.6.1 and Velocity Tools 2.0-beta3
|
|
Apache UIMA 2.3.1-SNAPSHOT
|
|
|
|
|
|
Upgrading from Solr 1.4
----------------------

* The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no longer be able to read your indices.
In a master/slave configuration, all searchers/slaves should be upgraded
before the master. If the master were to be updated first, the older
searchers would not be able to read the new index format.

* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)

* The experimental ALIAS command has been removed. (SOLR-1637)

* Using solr.xml is recommended for single cores also. (SOLR-1621)

* The old syntax of the <highlighting> configuration in solrconfig.xml
is deprecated. (SOLR-1696)

* The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
HTMLStripCharFilter should be used instead, and it works with any
Tokenizer of your choice. (SOLR-1657)

* Field compression is no longer supported. Fields that were formerly
compressed will be uncompressed as index segments are merged. For
shorter fields, this may actually be an improvement, as the compression
used was not very good for short text. Some indexes may get larger though.

* SOLR-1845: The TermsComponent response format was changed so that the
"terms" container is a map instead of a named list. This affects
response formats like JSON, but not XML. (yonik)

* SOLR-1876: All Analyzers and TokenStreams are now final to enforce
the decorator pattern. (rmuir, uschindler)

* LUCENE-2608: Added the ability to specify the accuracy on a per-request basis.
Implementations of SolrSpellChecker are encouraged, but not required, to
change over to the new SolrSpellChecker methods that use the new
SpellingOptions class. While this change is backward compatible, the trunk
version of Solr has already dropped support for all but the SpellingOptions
method. (gsingers)

* The readercycle script was removed. (SOLR-2046)

* In previous releases, sorting or evaluating function queries on
fields that were "multiValued" (either by explicit declaration in
schema.xml or by implicit behavior because the "version" attribute on
the schema was less than 1.2) did not generally work, but it would
sometimes silently act as if it succeeded and order the docs
arbitrarily. Solr will now fail on any attempt to sort, or apply a
function to, multi-valued fields.

* The DataImportHandler jars are no longer included in the Solr
WAR and should be added in Solr's lib directory, or referenced
via the <lib> directive in solrconfig.xml.
|
|
|
|
|
|
Detailed Change List
|
|
----------------------
|
|
|
|
New Features
|
|
----------------------
|
|
|
|
* SOLR-1302: Added several new distance based functions, including
|
|
Great Circle (haversine), Manhattan, Euclidean and String (using the
|
|
StringDistance methods in the Lucene spellchecker).
|
|
Also added geohash(), deg() and rad() convenience functions.
|
|
See http://wiki.apache.org/solr/FunctionQuery. (gsingers)
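
For readers unfamiliar with the Great Circle (haversine) distance named in SOLR-1302, the following is a minimal Python sketch of the standard haversine formula. It is not Solr's implementation: the function name, argument order, and the `radius` parameter are assumptions made for this illustration only.

```python
import math

def haversine(lat1, lon1, lat2, lon2, radius=1.0):
    """Great-circle distance between two points given in degrees.

    Hypothetical helper for illustration. `radius` sets the unit of the
    result; 1.0 yields the central angle in radians.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * radius * math.asin(math.sqrt(min(1.0, h)))
```

Passing the Earth's mean radius in kilometers or miles as `radius` would give a distance in that unit.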
|
|
|
|
* SOLR-1553: New dismax parser implementation (accessible as "edismax")
|
|
that supports full lucene syntax, improved reserved char escaping,
|
|
fielded queries, improved proximity boosting, and improved stopword
|
|
handling. Note: status is experimental for now. (yonik)
|
|
|
|
* SOLR-1574: Add many new functions from Java's Math class (e.g. sin, cos) (yonik)
|
|
|
|
* SOLR-1569: Allow functions to take in literal strings by modifying the
|
|
FunctionQParser and adding LiteralValueSource (gsingers)
|
|
|
|
* SOLR-1571: Added Unicode collation support through Lucene's CollationKeyFilter
|
|
(Robert Muir via shalin)
|
|
|
|
* SOLR-785: Distributed Search support for SpellCheckComponent
|
|
(Matthew Woytowitz, shalin)
|
|
|
|
* SOLR-1625: Add regexp support for TermsComponent (Uri Boness via noble)
|
|
|
|
* SOLR-1297: Add sort by Function capability (gsingers, yonik)
|
|
|
|
* SOLR-1139: Add TermsComponent Query and Response Support in SolrJ (Matt Weber via shalin)
|
|
|
|
* SOLR-1177: Distributed Search support for TermsComponent (Matt Weber via shalin)
|
|
|
|
* SOLR-1621, SOLR-1722: Allow current single core deployments to be specified by solr.xml (Mark Miller, noble)
|
|
|
|
* SOLR-1532: Allow StreamingUpdateSolrServer to use a provided HttpClient (Gabriele Renzi via shalin)
|
|
|
|
* SOLR-1653: Add PatternReplaceCharFilter (koji)
|
|
|
|
* SOLR-1131: FieldTypes can now output multiple Fields per Type and still be searched. This can be handy for hiding the details of a particular
|
|
implementation such as in the spatial case. (Chris Mattmann, shalin, noble, gsingers, yonik)
|
|
|
|
* SOLR-1586: Add support for Geohash and Spatial Tile FieldType (Chris Mattmann, gsingers)
|
|
|
|
* SOLR-1697: PluginInfo should load plugins w/o class attribute also (noble)
|
|
|
|
* SOLR-1268: Incorporate FastVectorHighlighter (koji)
|
|
|
|
* SOLR-1750: SolrInfoMBeanHandler added for simpler programmatic access
|
|
to info currently available from registry.jsp and stats.jsp
|
|
(ehatcher, hossman)
|
|
|
|
* SOLR-1815: SolrJ now preserves the order of facet queries. (yonik)
|
|
|
|
* SOLR-1677: Add support for choosing the Lucene Version for Lucene components within
|
|
Solr. (Uwe Schindler, Mark Miller)
|
|
|
|
* SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage.
|
|
(Alex Baranov via yonik)
|
|
|
|
* SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
|
|
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
|
|
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
|
|
performance of SnowballPorterFilterFactory. (rmuir)
|
|
|
|
* SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
|
|
TokenFilters now support custom Attributes, and some have improved performance:
|
|
especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler)
|
|
|
|
* SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
|
|
parameters for controlling the minimum shingle size produced by the filter, and
|
|
the separator string that it uses, respectively. (Steven Rowe via rmuir)
|
|
|
|
* SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
|
|
parameter, to output unigrams if the number of input tokens is fewer than
|
|
minShingleSize, and no shingles can be generated.
|
|
(Chris Harris via Steven Rowe)
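
The three ShingleFilterFactory parameters described in SOLR-1740 and SOLR-744 can be pictured with a small Python sketch. This is only a model of shingle generation under stated assumptions: the function and argument names are hypothetical, regular unigram output and token positions are not modeled, and the real filter is the Lucene ShingleFilter.

```python
def shingles(tokens, min_size=2, max_size=2, sep=" ", unigrams_if_none=False):
    """Build word n-grams ("shingles") of min_size..max_size tokens.

    Hypothetical model: `min_size` stands in for minShingleSize, `sep` for
    tokenSeparator, and `unigrams_if_none` for outputUnigramsIfNoShingles.
    """
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(sep.join(tokens[start:start + size]))
    # Fewer input tokens than min_size: no shingle could be generated.
    if not out and unigrams_if_none:
        return list(tokens)
    return out
```

With a single input token and `min_size=2`, the result is empty unless `unigrams_if_none` is set, which is the case the SOLR-744 parameter addresses.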
|
|
|
|
* SOLR-1923: PhoneticFilterFactory now has support for the
|
|
Caverphone algorithm. (rmuir)
|
|
|
|
* SOLR-1957: The VelocityResponseWriter contrib moved to core.
|
|
Example search UI now available at http://localhost:8983/solr/browse
|
|
(ehatcher)
|
|
|
|
* SOLR-1974: Add LimitTokenCountFilterFactory. (koji)
|
|
|
|
* SOLR-1966: QueryElevationComponent can now return just the included results in the elevation file (gsingers, yonik)
|
|
|
|
* SOLR-1556: TermVectorComponent now supports per-field overrides. It also now
throws an error if any of the requested fields do not exist, and returns
warnings for fields whose requested term vector options (termVectors,
offsets, positions) do not align with the schema declaration. (gsingers)
|
|
|
|
* SOLR-1985: FastVectorHighlighter: add wrapper class for Lucene's SingleFragListBuilder (koji)
|
|
|
|
* SOLR-1984: Add HyphenationCompoundWordTokenFilterFactory. (PB via rmuir)
|
|
|
|
* SOLR-397: Date Faceting now supports a "facet.date.include" param
|
|
for specifying when the upper & lower end points of computed date
|
|
ranges should be included in the range. Legal values are: "all",
|
|
"lower", "upper", "edge", and "outer". For backwards compatibility
|
|
the default value is the set: [lower,upper,edge], so that all ranges
|
|
between start and end are inclusive of their endpoints, but the
|
|
"before" and "after" ranges are not.
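
One way to read the "facet.date.include" values from SOLR-397 is sketched below in Python. The sketch covers only the computed ranges between start and end, not the "before" and "after" ranges. It rests on an assumption not spelled out in the entry: that "edge" makes the first range include its lower bound and the last range include its upper bound. The function name is hypothetical.

```python
def gap_range_bounds(index, count, include=("lower", "upper", "edge")):
    """Return (lower_inclusive, upper_inclusive) for range `index` of `count`.

    Hypothetical helper; the default mirrors the backwards-compatible
    default set [lower,upper,edge] described in the entry.
    """
    inc = set(include)
    if "all" in inc:
        inc = {"lower", "upper", "edge", "outer"}
    first, last = index == 0, index == count - 1
    # Assumed meaning of "edge": it only affects the outermost end points.
    lower = "lower" in inc or ("edge" in inc and first)
    upper = "upper" in inc or ("edge" in inc and last)
    return lower, upper
```

Under the default set every computed range includes both end points, which matches the behavior the entry describes.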
|
|
|
|
* SOLR-945: JSON update handler that accepts add, delete, commit
|
|
commands in JSON format. (Ryan McKinley, yonik)
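
A client might assemble a request body for the JSON update handler from SOLR-945 roughly as follows. This is a hedged sketch: the entry only states that add, delete, and commit commands are accepted, so the nesting shown here ("doc" under "add", "id" under "delete") and the document field names are assumptions to check against the handler's documentation.

```python
import json

def build_update(doc, delete_id):
    """Serialize one add, one delete and one commit command as JSON.

    Hypothetical helper; the message layout is an assumption.
    """
    commands = {
        "add": {"doc": doc},
        "delete": {"id": delete_id},
        "commit": {},
    }
    return json.dumps(commands)

body = build_update({"id": "doc1", "title": "Hello"}, "doc0")
```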
|
|
|
|
* SOLR-2015: Add a boolean attribute autoGeneratePhraseQueries to TextField.
|
|
autoGeneratePhraseQueries="true" (the default) causes the query parser to
|
|
generate phrase queries if multiple tokens are generated from a single
|
|
non-quoted analysis string. For example WordDelimiterFilter splitting text:pdp-11
|
|
will cause the parser to generate text:"pdp 11" rather than (text:pdp OR text:11).
|
|
Note that autoGeneratePhraseQueries="true" tends to not work well for non-whitespace-delimited
|
|
delimited languages. (yonik)
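
The effect of autoGeneratePhraseQueries in SOLR-2015 can be modeled with a few lines of Python. This is an illustration, not the query parser: the function name is hypothetical, the tokens are assumed to be what analysis produced from one non-quoted chunk of query text, and OR is assumed as the default operator.

```python
def build_query(field, tokens, auto_generate_phrase_queries=True):
    """Render the query produced for the tokens of one analysis string.

    Hypothetical model of the behavior described in the entry.
    """
    if len(tokens) == 1:
        return f"{field}:{tokens[0]}"
    if auto_generate_phrase_queries:
        phrase = " ".join(tokens)
        return f'{field}:"{phrase}"'
    # Otherwise each token becomes its own optional clause.
    clauses = " OR ".join(f"{field}:{t}" for t in tokens)
    return f"({clauses})"
```

For the pdp-11 example, the tokens would be `["pdp", "11"]` after WordDelimiterFilter splits the input.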
|
|
|
|
* SOLR-1925: Add CSVResponseWriter (use wt=csv) that returns the list of documents
|
|
in CSV format. (Chris Mattmann, yonik)
|
|
|
|
* SOLR-1240: "Range Faceting" has been added. This is a generalization
|
|
of the existing "Date Faceting" logic so that it now supports
|
|
all stock numeric field types that support range queries in addition
|
|
to dates. facet.date is now deprecated in favor of this generalized mechanism.
|
|
(Gijs Kunze, hossman)
|
|
|
|
* SOLR-2021: Add SolrEncoder plugin to Highlighter. (koji)
|
|
|
|
* SOLR-2030: Make FastVectorHighlighter use SolrEncoder. (koji)
|
|
|
|
* SOLR-2053: Add support for custom comparators in Solr spellchecker, per LUCENE-2479 (gsingers)
|
|
|
|
* SOLR-2049: Add hl.multiValuedSeparatorChar for FastVectorHighlighter, per LUCENE-2603. (koji)
|
|
|
|
* SOLR-2059: Add "types" attribute to WordDelimiterFilterFactory, which
|
|
allows you to customize how WordDelimiterFilter tokenizes text with
|
|
a configuration file. (Peter Karich, rmuir)
|
|
|
|
* SOLR-2099: Add ability to throttle rsync based replication using rsync option --bwlimit.
|
|
(Brandon Evans via koji)
|
|
|
|
* SOLR-1316: Create autosuggest component.
|
|
(Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab)
|
|
|
|
* SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
|
|
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
|
|
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers)
|
|
|
|
* SOLR-2128: Full parameter substitution for function queries.
|
|
Example: q=add($v1,$v2)&v1=mul(popularity,5)&v2=20.0
|
|
(yonik)
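
The SOLR-2128 example can be encoded for a URL and expanded by hand as in the Python sketch below. The `expand` helper is hypothetical and performs a single pass of plain text replacement purely to show which parameter each `$name` refers to. Solr itself parses the referenced parameter as a function and is not claimed to work by string substitution.

```python
import re
from urllib.parse import urlencode, parse_qs

def expand(expr, params):
    """Replace each $name with the value of request parameter `name`."""
    return re.sub(r"\$(\w+)", lambda m: params[m.group(1)], expr)

# Parameters taken from the example in the entry.
params = {"q": "add($v1,$v2)", "v1": "mul(popularity,5)", "v2": "20.0"}
query_string = urlencode(params)       # percent-encodes $, parentheses, commas
expanded = expand(params["q"], params)
```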
|
|
|
|
* SOLR-2133: Function query parser can now parse multiple comma separated
|
|
value sources. It also now fails if there is extra unexpected text
|
|
after parsing the functions, instead of silently ignoring it.
|
|
This allows expressions like q=dist(2,vector(1,2),$pt)&pt=3,4 (yonik)
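
To see what the SOLR-2133 example expression would evaluate to, the Python sketch below mimics it. It assumes that the first argument of dist is the power of the norm (2 for Euclidean) and that the remaining arguments are the coordinates of two points of equal dimension. The `dist` function here is an illustrative stand-in, not Solr code.

```python
def dist(power, *coords):
    """p-norm distance between two points given as a flat coordinate list."""
    if power <= 0 or len(coords) % 2 != 0:
        raise ValueError("need a positive power and two equal-size points")
    half = len(coords) // 2
    a, b = coords[:half], coords[half:]
    return sum(abs(x - y) ** power for x, y in zip(a, b)) ** (1.0 / power)

# pt=3,4 contributes two comma-separated values, as in the entry's example.
pt = [float(v) for v in "3,4".split(",")]
result = dist(2, 1, 2, *pt)
```

The single parameter `pt` supplies both coordinates of the second point, which is what parsing multiple comma-separated value sources makes possible.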
|
|
|
|
* SOLR-2157: Suggester should return alpha-sorted results when onlyMorePopular=false (ab)
|
|
|
|
* SOLR-2010: Added ability to verify that spell checking collations have
|
|
actual results in the index. (James Dyer via gsingers)
|
|
|
|
* SOLR-2188: Added "maxTokenLength" argument to the factories for ClassicTokenizer,
|
|
StandardTokenizer, and UAX29URLEmailTokenizer. (Steven Rowe)
|
|
|
|
* SOLR-2129: Added a Solr module for dynamic metadata extraction/indexing with Apache UIMA.
|
|
See contrib/uima/README.txt for more information. (Tommaso Teofili via rmuir)
|
|
|
|
* SOLR-2325: Allow tagging and exclusion of main query for faceting. (yonik)
|
|
|
|
* SOLR-2263: Add ability for RawResponseWriter to stream binary files as well as
|
|
text files. (Eric Pugh via yonik)
|
|
|
|
* SOLR-860: Add debug output for MoreLikeThis. (koji)
|
|
|
|
* SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji)
|
|
|
|
* SOLR-1804: Re-enabled clustering component on trunk, updated to latest
|
|
version of Carrot2. No more LGPL run-time dependencies. This release of
|
|
C2 also does not have a specific Lucene dependency.
|
|
(Stanislaw Osinski, gsingers)
|
|
|
|
* SOLR-2282: Add distributed search support for search result clustering.
|
|
(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
|
|
|
|
* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
|
|
|
|
* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
|
|
tokenizer and filters to contrib/analysis-extras (rmuir)
|
|
|
|
* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
|
|
UAX#29, a Unicode algorithm with good results for most languages, as well as
|
|
URL and E-mail tokenization according to the relevant RFCs.
|
|
(Tom Burton-West via rmuir)
|
|
|
|
* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
|
|
|
|
* SOLR-1525: allow DIH to refer to core properties (noble)
|
|
|
|
* SOLR-1547: DIH TemplateTransformer copy objects more intelligently when the
|
|
template is a single variable (noble)
|
|
|
|
* SOLR-1627: DIH VariableResolver should be fetched just in time (noble)
|
|
|
|
* SOLR-1583: DIH Create DataSources that return InputStream (noble)
|
|
|
|
* SOLR-1358: Integration of Tika and DataImportHandler (Akshay Ukey, noble)
|
|
|
|
* SOLR-1654: TikaEntityProcessor example added to DIHExample
|
|
(Akshay Ukey via noble)
|
|
|
|
* SOLR-1678: Move onError handling to DIH framework (noble)
|
|
|
|
* SOLR-1352: Multi-threaded implementation of DIH (noble)
|
|
|
|
* SOLR-1721: Add explicit option to run DataImportHandler in synchronous mode
|
|
(Alexey Serba via noble)
|
|
|
|
* SOLR-1737: Added FieldStreamDataSource (noble)
|
|
|
|
|
|
Optimizations
|
|
----------------------
|
|
|
|
* SOLR-1679: Don't build up string messages in SolrCore.execute unless they
|
|
are necessary for the current log level.
|
|
(Fuad Efendi and hossman)
|
|
|
|
* SOLR-1874: Optimize PatternReplaceFilter for better performance. (rmuir, uschindler)
|
|
|
|
* SOLR-1968: speed up initial filter cache population for facet.method=enum and
|
|
also big terms for multi-valued facet.method=fc. The resulting speedup
|
|
for the first facet request is anywhere from 30% to 32x, depending on how many
|
|
terms are in the field and how many documents match per term. (yonik)
|
|
|
|
* SOLR-2089: Speed up UnInvertedField faceting (facet.method=fc for
|
|
multi-valued fields) when facet.limit is both high, and a high enough
|
|
percentage of the number of unique terms in the field. Extreme cases
|
|
yield speedups over 3x. (yonik)
|
|
|
|
* SOLR-2046: add common functions to scripts-util. (koji)
|
|
|
|
* SOLR-1684: Switch clustering component to use the
|
|
SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document
|
|
cache (gsingers)
|
|
|
|
* SOLR-2200: Improve the performance of DataImportHandler for large
|
|
delta-import updates. (Mark Waddle via rmuir)
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
|
|
|
|
* SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate
|
|
to the original ValueSource.getValues(reader) so custom sources
|
|
will work. (yonik)
|
|
|
|
* SOLR-1572: FastLRUCache correctly implemented the LRU policy only
|
|
for the first 2B accesses. (yonik)
|
|
|
|
* SOLR-1582: copyField was ignored for BinaryField types (gsingers)
|
|
|
|
* SOLR-1563: Binary fields, including trie-based numeric fields, caused null
|
|
pointer exceptions in the luke request handler. (yonik)
|
|
|
|
* SOLR-1577: The example solrconfig.xml defaulted to a solr data dir
|
|
relative to the current working directory, even if a different solr home
|
|
was being used. The new behavior changes the default to a zero length
|
|
string, which is treated the same as if no dataDir had been specified,
|
|
hence the "data" directory under the solr home will be used. (yonik)
|
|
|
|
* SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added
|
|
fl=score to the parameter list instead of appending score to the
|
|
existing field list. (yonik)
|
|
|
|
* SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always
|
|
uses Lucene default. (Lance Norskog via Mark Miller)
|
|
|
|
* SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
|
|
(i.e. code points outside of the BMP), resulting in incorrect
|
|
matching. This change requires reindexing for any content with
|
|
such characters. (Robert Muir, yonik)
|
|
|
|
* SOLR-1596: A rollback operation followed by the shutdown of Solr
|
|
or the close of a core resulted in a warning:
|
|
"SEVERE: SolrIndexWriter was not closed prior to finalize()" although
|
|
there were no other consequences. (yonik)
|
|
|
|
* SOLR-1595: StreamingUpdateSolrServer used the platform default character
|
|
set when streaming updates, rather than using UTF-8 as the HTTP headers
|
|
indicated, leading to an encoding mismatch. (hossman, yonik)
|
|
|
|
* SOLR-1587: A distributed search request with fl=score didn't match
|
|
the behavior of a non-distributed request since it only returned
|
|
the id,score fields instead of all fields in addition to score. (yonik)
|
|
|
|
* SOLR-1601: Schema browser does not indicate presence of charFilter. (koji)
|
|
|
|
* SOLR-1615: Backslash escaping did not work in quoted strings
|
|
for local param arguments. (Wojtek Piaseczny, yonik)
|
|
|
|
* SOLR-1628: log contains incorrect number of adds and deletes.
|
|
(Thijs Vonk via yonik)
|
|
|
|
* SOLR-343: Date faceting now respects facet.mincount limiting
|
|
(Uri Boness, Raiko Eckstein via hossman)
|
|
|
|
* SOLR-1624: Highlighter only highlights values from the first field value
|
|
in a multivalued field when term positions (term vectors) are stored.
|
|
(Chris Harris via yonik)
|
|
|
|
* SOLR-1635: Fixed error message when numeric values can't be parsed by
|
|
DOMUtils - notably for plugin init params in solrconfig.xml.
|
|
(hossman)
|
|
|
|
* SOLR-1651: Fixed Incorrect dataimport handler package name in SolrResourceLoader
|
|
(Akshay Ukey via shalin)
|
|
|
|
* SOLR-1660: CapitalizationFilter crashes if you use the maxWordCountOption
|
|
(Robert Muir via shalin)
|
|
|
|
* SOLR-1667: PatternTokenizer does not reset attributes such as positionIncrementGap
|
|
(Robert Muir via shalin)
|
|
|
|
* SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that
|
|
could halt the streaming of documents. The original patch to fix this
|
|
(never officially released) introduced another hanging bug due to
|
|
connections not being released.
|
|
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)
|
|
|
|
* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
|
|
retrieved from ContentStreams are not closed in various places, resulting
|
|
in file descriptor leaks.
|
|
(Christoph Brill, Mark Miller)
|
|
|
|
* SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search
|
|
(Janne Majaranta via koji)
|
|
|
|
* SOLR-1736: In the slave, if moving a file does not succeed, copy the file instead. (noble)
|
|
|
|
* SOLR-1579: Fixes to XML escaping in stats.jsp
|
|
(David Bowen and hossman)
|
|
|
|
* SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can
|
|
result in incorrectly sorted results. (yonik)
|
|
|
|
* SOLR-1798: Small memory leak (~100 bytes) in FastLRUCache for every
|
|
commit. (yonik)
|
|
|
|
* SOLR-1823: Fixed XMLResponseWriter (via XMLWriter) so it no longer throws
|
|
a ClassCastException when a Map containing a non-String key is used.
|
|
(Frank Wesemann, hossman)
|
|
|
|
* SOLR-1797: fix ConcurrentModificationException and potential memory
|
|
leaks in ResourceLoader. (yonik)
|
|
|
|
* SOLR-1850: change KeepWordFilter so a new word set is not created for
|
|
each instance (John Wang via yonik)
|
|
|
|
* SOLR-1706: fixed WordDelimiterFilter for certain combinations of options
|
|
where it would output incorrect tokens. (Robert Muir, Chris Male)
|
|
|
|
* SOLR-1936: The JSON response format needed to escape unicode code point
|
|
U+2028 - 'LINE SEPARATOR' (Robert Hofstra, yonik)
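
The likely motivation for SOLR-1936 is that U+2028 is legal inside a JSON string but has historically been treated as a line terminator in JavaScript source, so unescaped output could break clients that evaluate the response as script. A minimal Python sketch of the escaping idea follows; the helper name is hypothetical and this is not Solr's writer.

```python
import json

def dumps_js_safe(value):
    """Serialize to JSON and escape U+2028 so the text is safe in JavaScript."""
    text = json.dumps(value, ensure_ascii=False)
    # The raw character can only occur inside string literals, so a plain
    # replacement with its \u escape keeps the document equivalent.
    return text.replace("\u2028", "\\u2028")

escaped = dumps_js_safe("a\u2028b")
```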
|
|
|
|
* SOLR-1914: Change the JSON response format to output float/double
|
|
values of NaN,Infinity,-Infinity as strings. (yonik)
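
The change in SOLR-1914 matters because strict JSON has no literal for NaN or the infinities. The Python sketch below shows the same idea on the producing side: non-finite floats are replaced with the strings named in the entry before serialization. The helper name is hypothetical.

```python
import json
import math

def json_safe(value):
    """Recursively replace non-finite floats with their string names."""
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        return value
    if isinstance(value, dict):
        return {k: json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(v) for v in value]
    return value

data = {"a": float("nan"), "b": [float("inf"), float("-inf"), 1.5]}
# allow_nan=False makes json.dumps reject any non-finite float left behind.
text = json.dumps(json_safe(data), allow_nan=False)
```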
|
|
|
|
* SOLR-1948: PatternTokenizerFactory should use parent's args (koji)
|
|
|
|
* SOLR-1870: Indexing documents using the 'javabin' format no longer
|
|
fails with a ClassCastException when SolrInputDocuments contain field
|
|
values which are Collections or other classes that implement
|
|
Iterable. (noble, hossman)
|
|
|
|
* SOLR-1981: Solr will now fail correctly if solr.xml attempts to
|
|
specify multiple cores that have the same name (hossman)
|
|
|
|
* SOLR-1791: Fix messed up core names on admin gui (yonik via koji)
|
|
|
|
* SOLR-1995: Change date format from "hour in am/pm" to "hour in day"
|
|
in CoreContainer and SnapShooter. (Hayato Ito, koji)
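
The difference fixed by SOLR-1995 is the one between a 12-hour and a 24-hour hour field in a timestamp pattern. The Python sketch below uses strftime's `%I` and `%H` as stand-ins for "hour in am/pm" and "hour in day". The timestamp layout and function name are assumptions for illustration, not the patterns used in CoreContainer or SnapShooter.

```python
from datetime import datetime

def snapshot_stamp(moment, hour_in_day=True):
    """Format a timestamp with a 24-hour (%H) or 12-hour (%I) hour field."""
    pattern = "%Y%m%d%H%M%S" if hour_in_day else "%Y%m%d%I%M%S"
    return moment.strftime(pattern)

afternoon = datetime(2010, 8, 1, 15, 30, 0)
morning = datetime(2010, 8, 1, 3, 30, 0)
```

With the 12-hour field and no am/pm marker, the two times above produce the same stamp, which is why the 24-hour form is the safer choice for names that must sort or be unique.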
|
|
|
|
* SOLR-2008: avoid possible RejectedExecutionException w/autoCommit
|
|
by making SolrCore close the UpdateHandler before closing the
|
|
SearchExecutor. (NarasimhaRaju, hossman)
|
|
|
|
* SOLR-2036: Avoid expensive fieldCache ram estimation for the
|
|
admin stats page. (yonik)
|
|
|
|
* SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji)
|
|
|
|
* SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers)
|
|
|
|
* SOLR-2100: The replication handler backup command didn't save the commit
|
|
point and hence could fail when a newer commit caused the older commit point
|
|
to be removed before it was finished being copied. This did not affect
|
|
normal master/slave replication. (Peter Sturge via yonik)
|
|
|
|
* SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers)
|
|
|
|
* SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers)
|
|
|
|
* SOLR-2111: Change exception handling in distributed faceting to work more
|
|
like non-distributed faceting, change facet_counts/exception from a String
|
|
to a List<String> to enable listing all exceptions that happened, and
|
|
prevent an exception in one facet command from affecting another
|
|
facet command. (yonik)
|
|
|
|
* SOLR-2110: Remove the restriction on names for local params
|
|
substitution/dereferencing. Properly encode local params in
|
|
distributed faceting. (yonik)
|
|
|
|
* SOLR-2135: Fix behavior of ConcurrentLRUCache when asking for
|
|
getLatestAccessedItems(0) or getOldestAccessedItems(0).
|
|
(David Smiley via hossman)
|
|
|
|
* SOLR-2148: Highlighter doesn't support q.alt. (koji)
|
|
|
|
* SOLR-2180: It was possible for EmbeddedSolrServer to leave searchers
|
|
open if a request threw an exception. (yonik)
|
|
|
|
* SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab)
|
|
|
|
* SOLR-2081: BaseResponseWriter.isStreamingDocs causes
|
|
SingleResponseWriter.end to be called 2x
|
|
(Chris A. Mattmann via hossman)
|
|
|
|
* SOLR-2219: The init() method of every SolrRequestHandler was being
|
|
called twice. (ambikeshwar singh and hossman)
|
|
|
|
* SOLR-2285: duplicate SolrEventListeners no longer created (hossman)
|
|
|
|
* SOLR-1993: fix String cast assumption in JavaBinCodec - specifically
|
|
addresses "commitWithin" option on Update requests.
|
|
(noble, hossman, and Maxim Valyanskiy)
|
|
|
|
* SOLR-2261: fix velocity template layout.vm that referred to an older
|
|
version of jquery. (Eric Pugh via rmuir)
|
|
|
|
* SOLR-2307: fix bug in PHPSerializedResponseWriter (wt=phps) when
|
|
dealing with SolrDocumentList objects -- ie: sharded queries.
|
|
(Antonio Verni via hossman)
|
|
|
|
* SOLR-2127: Fixed serialization of default core and indentation of solr.xml when serializing.
|
|
(Ephraim Ofir, Mark Miller)
|
|
|
|
* SOLR-2320: Fixed ReplicationHandler detail reporting for masters
|
|
(hossman)
|
|
|
|
* SOLR-482: Provide more exception handling in CSVLoader (gsingers)
|
|
|
|
* SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception.
|
|
(Julien Coloos, hossman, yonik)
|
|
|
|
* SOLR-2085: Improve SolrJ behavior when FacetComponent comes before
|
|
QueryComponent (Tomas Salfischberger via hossman)
|
|
|
|
* SOLR-1940: Fix SolrDispatchFilter behavior when Content-Type is
|
|
unknown (Lance Norskog and hossman)
|
|
|
|
* SOLR-1983: snappuller fails when modifiedConfFiles is not empty and
|
|
full copy of index is needed. (Alexander Kanarsky via yonik)
|
|
|
|
* SOLR-2156: SnapPuller fails to clean Old Index Directories on Full Copy
|
|
(Jayendra Patil via yonik)
|
|
|
|
* SOLR-96: Fix XML parsing in XMLUpdateRequestHandler and
|
|
DocumentAnalysisRequestHandler to respect charset from XML file and only
|
|
use HTTP header's "Content-Type" as a "hint". (uschindler)
|
|
|
|
* SOLR-2339: Fix sorting to explicitly generate an error if you
|
|
attempt to sort on a multiValued field. (hossman)
|
|
|
|
* SOLR-2348: Fix field types to explicitly generate an error if you
|
|
attempt to get a ValueSource for a multiValued field. (hossman)
|
|
|
|
* SOLR-2380: Distributed faceting could miss values when facet.sort=index
|
|
and when facet.offset was greater than 0. (yonik)
|
|
|
|
* SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
|
|
are fixed to be resolved using the URI standard (RFC 2396). The system
|
|
identifier is no longer a plain filename with path, it gets initialized
|
|
using a custom URI scheme "solrres:". This scheme is resolved using an
|
|
EntityResolver that utilizes ResourceLoader
|
|
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
|
|
paths in Solr's config files behave as expected. This change
|
|
introduces some backwards breaks in the API: Some config classes
|
|
(Config, SolrConfig, IndexSchema) were changed to take
|
|
org.xml.sax.InputSource instead of InputStream. There may also be some
|
|
backwards breaks in existing config files; it is recommended to check
|
|
your config files / XSLTs and replace all XIncludes/HREFs that were
|
|
hacked to use absolute paths to use relative ones. (uschindler)
|
|
|
|
* SOLR-309: Fix FieldType so setting an analyzer on a FieldType that
|
|
doesn't expect it will generate an error. Practically speaking this
|
|
means that Solr will now correctly generate an error on
|
|
initialization if the schema.xml contains an analyzer configuration
|
|
for a fieldType that does not use TextField. (hossman)
|
|
|
|
* SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
|
|
thread safe and could throw an exception. (yonik)
|
|
|
|
* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary
|
|
option (gsingers)
|
|
|
|
* SOLR-1756: The date.format setting for extraction request handler causes
|
|
ClassCastException when enabled and the config code that parses this setting
|
|
does not properly use the same iterator instance.
|
|
(Christoph Brill, Mark Miller)
|
|
|
|
* SOLR-1638: Fixed NullPointerException during DIH import if uniqueKey is not
|
|
specified in schema (Akshay Ukey via shalin)
|
|
|
|
* SOLR-1639: Fixed misleading error message when dataimport.properties is not
|
|
writable (shalin)
|
|
|
|
* SOLR-1598: DIH: Reader used in PlainTextEntityProcessor is not explicitly
|
|
closed (Sascha Szott via noble)
|
|
|
|
* SOLR-1759: DIH: $skipDoc was not working correctly
|
|
(Gian Marco Tagliani via noble)
|
|
|
|
* SOLR-1762: DIH: DateFormatTransformer does not work correctly with
|
|
non-default locale dates (tommy chheng via noble)
|
|
|
|
* SOLR-1757: DIH multithreading sometimes throws NPE (noble)
|
|
|
|
* SOLR-1766: DIH with threads enabled doesn't respond to the abort command
|
|
(Michael Henson via noble)
|
|
|
|
* SOLR-1767: dataimporter.functions.escapeSql() does not escape backslash
|
|
character (Sean Timm via noble)
|
|
|
|
* SOLR-1811: formatDate should use the current NOW value always
|
|
(Sean Timm via noble)
|
|
|
|
* SOLR-1794: Dataimport of CLOB fields fails when getCharacterStream() is
|
|
defined in a superclass. (Gunnar Gauslaa Bergem via rmuir)
|
|
|
|
* SOLR-2057: DataImportHandler never calls UpdateRequestProcessor.finish()
|
|
(Drew Farris via koji)
|
|
|
|
* SOLR-1973: Empty fields in XML update messages confuse DataImportHandler.
|
|
(koji)
|
|
|
|
* SOLR-2221: Use StrUtils.parseBool() to get values of boolean options in DIH.
|
|
true/on/yes (for TRUE) and false/off/no (for FALSE) can be used for
|
|
sub-options (debug, verbose, synchronous, commit, clean, optimize) for
|
|
full/delta-import commands. (koji)
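
The accepted spellings listed in SOLR-2221 can be summarized by a small Python sketch. It is not the StrUtils.parseBool() implementation: the function name, the case-insensitive matching, the whitespace trimming, and the choice to raise on anything else are assumptions made for this example.

```python
def parse_bool(text):
    """Map true/on/yes to True and false/off/no to False."""
    value = text.strip().lower()
    if value in ("true", "on", "yes"):
        return True
    if value in ("false", "off", "no"):
        return False
    raise ValueError(f"invalid boolean value: {text!r}")
```

A request such as a full-import with clean=off and commit=yes would, under this reading, disable cleaning and enable the commit.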
|
|
|
|
* SOLR-2310: DIH: getTimeElapsedSince() returns incorrect hour value when
|
|
the elapsed time is over 60 hours (tom liu via koji)
|
|
|
|
* SOLR-2252: DIH: When a child entity in nested entities is rootEntity="true",
|
|
delta-import doesn't work. (koji)
|
|
|
|
* SOLR-2330: solrconfig.xml files in example-DIH are broken. (Matt Parker, koji)
|
|
|
|
* SOLR-1191: resolve DataImportHandler deltaQuery column against pk when pk
|
|
has a prefix (e.g. pk="book.id" deltaQuery="select id from ..."). More
|
|
useful error reporting when no match found (previously failed with a
|
|
NullPointerException in log and no clear user feedback). (gthb via yonik)
|
|
|
|
* SOLR-2116: Fix TikaConfig classloader bug in TikaEntityProcessor
|
|
(Martijn van Groningen via hossman)
|
|
|
|
Other Changes
|
|
----------------------
|
|
|
|
* SOLR-1602: Refactor SOLR package structure to include o.a.solr.response
|
|
and move QueryResponseWriters in there
|
|
(Chris A. Mattmann, ryan, hoss)
|
|
|
|
* SOLR-1516: Addition of an abstract BaseResponseWriter class to simplify the
|
|
development of QueryResponseWriter implementations.
|
|
(Chris A. Mattmann via noble)
|
|
|
|
* SOLR-1592: Refactor XMLWriter startTag to allow arbitrary attributes to be written
|
|
(Chris A. Mattmann via noble)
|
|
|
|
* SOLR-1561: Added Lucene 2.9.1 spatial contrib jar to lib. (gsingers)
|
|
|
|
* SOLR-1570: Log warnings if uniqueKey is multi-valued or not stored (hossman, shalin)
|
|
|
|
* SOLR-1558: QueryElevationComponent only works if the uniqueKey field is
|
|
implemented using StrField. In previous versions of Solr no warning or
|
|
error would be generated if you attempted to use QueryElevationComponent,
|
|
it would just fail in unexpected ways. This has been changed so that it
|
|
will fail with a clear error message on initialization. (hossman)
|
|
|
|
* SOLR-1611: Added Lucene 2.9.1 collation contrib jar to lib (shalin)
|
|
|
|
* SOLR-1608: Extract base class from TestDistributedSearch to make
|
|
it easy to write test cases for other distributed components. (shalin)
|
|
|
|
* Upgraded to Lucene 2.9-dev r888785 (shalin)
|
|
|
|
* SOLR-1610: Generify SolrCache (Jason Rutherglen via shalin)
|
|
|
|
* SOLR-1637: Remove ALIAS command
|
|
|
|
* SOLR-1662: Added Javadocs in BufferedTokenStream and fixed incorrect cloning
|
|
in TestBufferedTokenStream (Robert Muir, Uwe Schindler via shalin)
|
|
|
|
* SOLR-1674: Improve analysis tests and cut over to new TokenStream API.
|
|
(Robert Muir via Mark Miller)
|
|
|
|
* SOLR-1661: Remove adminCore from CoreContainer. Removed the deprecated methods setAdminCore() and getAdminCore(). (noble)
|
|
|
|
* SOLR-1704: Google collections moved from clustering to core (noble)
|
|
|
|
* SOLR-1268: Add Lucene 2.9-dev r888785 FastVectorHighlighter contrib jar to lib. (koji)
|
|
|
|
* SOLR-1538: Reordering of object allocations in ConcurrentLRUCache to eliminate
|
|
(an extremely small) potential for deadlock.
|
|
(gabriele renzi via hossman)
|
|
|
|
* SOLR-1588: Removed some very old dead code.
|
|
(Chris A. Mattmann via hossman)
|
|
|
|
* SOLR-1696: Deprecate old <highlighting> syntax and move configuration to HighlightComponent (noble)
|
|
|
|
* SOLR-1727: SolrEventListener should extend NamedListInitializedPlugin (noble)
|
|
|
|
* SOLR-1771: Improved error message when StringIndex cannot be initialized
|
|
for a function query (hossman)
|
|
|
|
* SOLR-1695: Improved error messages when adding a document that does not
|
|
contain exactly one value for the uniqueKey field (hossman)
|
|
|
|
* SOLR-1776: DismaxQParser and ExtendedDismaxQParser now use the schema.xml
|
|
"defaultSearchField" as the default value for the "qf" param instead of failing
|
|
with an error when "qf" is not specified. (hossman)
|
|
|
|
* SOLR-1851: luceneAutoCommit no longer has any effect - it has been removed (Mark Miller)
|
|
|
|
* SOLR-1865: SolrResourceLoader.getLines ignores Byte Order Marks (BOMs) at the
|
|
beginning of input files; these are often created by editors such as Windows
|
|
Notepad. (rmuir, hossman)
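
The BOM handling described in SOLR-1865 can be illustrated in Python, where the "utf-8-sig" codec drops a leading UTF-8 byte order mark if one is present. Only that one aspect is modeled here. The function name is hypothetical and nothing else about SolrResourceLoader.getLines is implied.

```python
def get_lines(raw_bytes):
    """Decode UTF-8 input, ignoring a leading BOM, and split into lines."""
    text = raw_bytes.decode("utf-8-sig")
    return text.splitlines()

with_bom = b"\xef\xbb\xbfstopword\nanother\n"
without_bom = b"stopword\nanother\n"
```

Without this step the first entry of a word list saved with a BOM would start with an invisible character and never match.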
|
|
|
|
* SOLR-1938: ElisionFilterFactory will use a default set of French contractions
|
|
if you do not supply a custom articles file. (rmuir)
|
|
|
|
* SOLR-2003: SolrResourceLoader will report any encoding errors, rather than
|
|
silently using replacement characters for invalid inputs (blargy via rmuir)
|
|
|
|
* SOLR-1804: Google collections updated to Google Guava (which is a superset of collections and contains bug fixes) (gsingers)
|
|
|
|
* SOLR-2034: Switch to JavaBin codec version 2. Strings are now serialized
|
|
as the number of UTF-8 bytes, followed by the bytes in UTF-8. Previously
|
|
Strings were serialized as the number of UTF-16 chars, followed by the
|
|
bytes in Modified UTF-8. (hossman, yonik, rmuir)
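
The two length conventions contrasted in SOLR-2034 give different numbers for the same string as soon as it contains non-ASCII characters. The Python sketch below computes both counts. It does not model the JavaBin wire format itself, and the helper names are hypothetical.

```python
def utf8_byte_count(s):
    """Length as counted by codec version 2: number of UTF-8 bytes."""
    return len(s.encode("utf-8"))

def utf16_unit_count(s):
    """Length as counted previously: number of UTF-16 code units (chars)."""
    return len(s.encode("utf-16-le")) // 2

# One ASCII letter, one accented letter, one character outside the BMP.
sample = "a\u00e9\U0001d11e"
```

Because the prefix means something different in each version, a reader built for one cannot safely decode strings written by the other, which is consistent with the upgrade note asking SolrJ clients to be upgraded.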
|
|
|
|
* SOLR-2013: Add mapping-FoldToASCII.txt to example conf directory.
|
|
(Steven Rowe via koji)
|
|
|
|
* SOLR-2213: Upgrade to jQuery 1.4.3 (Erick Erickson via ryan)
|
|
|
|
* SOLR-1826: Add unit tests for highlighting with termOffsets=true
|
|
and overlapping tokens. (Stefan Oestreicher via rmuir)
|
|
|
|
* SOLR-2340: Add version information to the message in JavaBinCodec when
throwing an exception. (koji)
|
|
|
|
* SOLR-2350: Since Solr no longer requires XML files to be in UTF-8
|
|
(see SOLR-96) SimplePostTool (aka: post.jar) has been improved to
|
|
work with files of any mime-type or charset. (hossman)
|
|
|
|
* SOLR-2365: Move DIH jars out of solr.war (David Smiley via yonik)
|
|
|
|
* SOLR-2381: Include a patched version of Jetty (6.1.26 + JETTY-1340)
|
|
to fix problematic UTF-8 handling for supplementary characters.
|
|
(Bernd Fehling, uschindler, yonik, rmuir)
|
|
|
|
* SOLR-2391: The preferred Content-Type for XML was changed to
|
|
application/xml. XMLResponseWriter now only delivers using this
|
|
type; updating documents and analyzing documents is still supported
|
|
using text/xml as Content-Type, too. If you have clients that are
|
|
hardcoded on text/xml as Content-Type, you have to change them.
|
|
(uschindler, rmuir)
|
|
|
|
* SOLR-2414: All ResponseWriters now use only ServletOutputStreams
|
|
and wrap their own Writer around it when serializing. This fixes
|
|
the bug in PHPSerializedResponseWriter that produced wrong string
|
|
length if the servlet container had a broken UTF-8 encoding that was
|
|
in fact CESU-8 (see SOLR-1091). The system property to enable the
|
|
CESU-8 byte counting in PHPSerializedResponseWriter for broken
|
|
servlet containers was therefore removed and is now ignored if set.
|
|
Output is always UTF-8. (uschindler, yonik, rmuir)
|
|
|
|
* SOLR-141: Errors and Exceptions are formatted by ResponseWriter.
|
|
(Mike Sokolov, Rich Cariens, Daniel Naber, ryan)
|
|
|
|
* SOLR-1902: Upgraded to Tika 0.8 and changed deprecated parse call
|
|
|
|
* SOLR-1813: Add ICU4j to contrib/extraction libs and add tests for Arabic
|
|
extraction (Robert Muir via gsingers)
|
|
|
|
* SOLR-1821: Fix TimeZone-dependent test failure in TestEvaluatorBag.
|
|
(Chris Male via rmuir)
|
|
|
|
* SOLR-2367: Reduced noise in test output by ensuring the properties file
|
|
can be written. (Gunnlaugur Thor Briem via rmuir)
|
|
|
|
Build
|
|
----------------------
|
|
|
|
* SOLR-1522: Automated release signing process. (gsingers)
|
|
|
|
* SOLR-1891: Make lucene-jars-to-solr fail if copying any of the jars fails, and
|
|
update clean to remove the jars in that directory (Mark Miller)
|
|
|
|
* LUCENE-2466: Commons-Codec was upgraded from 1.3 to 1.4. (rmuir)
|
|
|
|
* SOLR-2042: Fixed some Maven deps (Drew Farris via gsingers)
|
|
|
|
* LUCENE-2657: Switch from using Maven POM templates to full POMs when
|
|
generating Maven artifacts (Steven Rowe)
|
|
|
|
Documentation
|
|
----------------------
|
|
|
|
* SOLR-1590: Javadoc for XMLWriter#startTag
|
|
(Chris A. Mattmann via hossman)
|
|
|
|
* SOLR-1792: Documented peculiar behavior of TestHarness.LocalRequestFactory
|
|
(hossman)
|
|
|
|
================== Release 1.4.0 ==================
|
|
Release Date: See http://lucene.apache.org/solr for the official release date.
|
|
|
|
Upgrading from Solr 1.3
|
|
-----------------------
|
|
|
|
There is a new default faceting algorithm for multiValued fields that should be
|
|
faster for most cases. One can revert to the previous algorithm (which has
|
|
also been improved somewhat) by adding facet.method=enum to the request.
|
|
|
|
Searching and sorting is now done on a per-segment basis, meaning that
|
|
the FieldCache entries used for sorting and for function queries are
|
|
created and used per-segment and can be reused for segments that don't
|
|
change between index updates. While generally beneficial, this can lead
|
|
to increased memory usage over 1.3 in certain scenarios:
|
|
1) A single valued field that was used for both sorting and faceting
|
|
in 1.3 would have used the same top level FieldCache entry. In 1.4,
|
|
sorting will use entries at the segment level while faceting will still
|
|
use entries at the top reader level, leading to increased memory usage.
|
|
2) Certain function queries such as ord() and rord() require a top level
|
|
FieldCache instance and can thus lead to increased memory usage. Consider
|
|
replacing ord() and rord() with alternatives, such as function queries
|
|
based on ms() for date boosting.
|
|
|
|
If you use custom Tokenizer or TokenFilter components in a chain specified in
|
|
schema.xml, they must support reusability. If your Tokenizer or TokenFilter
|
|
maintains state, it should implement reset(). If your TokenFilterFactory does
|
|
not return a subclass of TokenFilter, then it should implement reset() and call
|
|
reset() on its input TokenStream. TokenizerFactory implementations must
|
|
now return a Tokenizer rather than a TokenStream.
|
|
|
|
New users of Solr 1.4 will have omitTermFreqAndPositions enabled for non-text
|
|
indexed fields by default, which avoids indexing term frequency, positions, and
|
|
payloads, making the index smaller and faster. If you are upgrading from an
|
|
earlier Solr release and want to enable omitTermFreqAndPositions by default,
|
|
change the schema version from 1.1 to 1.2 in schema.xml. Remove any existing
|
|
index and restart Solr to ensure that omitTermFreqAndPositions completely takes
|
|
effect.
|
|
|
|
The default QParserPlugin used by the QueryComponent for parsing the "q" param
|
|
has been changed, to remove support for the deprecated use of ";" as a separator
|
|
between the query string and the sort options when no "sort" param was used.
|
|
Users who wish to continue using the semi-colon based method of specifying the
|
|
sort options should explicitly set the defType param to "lucenePlusSort" on all
|
|
requests. (The simplest way to do this is by specifying it as a default param
|
|
for your request handlers in solrconfig.xml, see the example solrconfig.xml for
|
|
sample syntax.)
|
|
|
|
If spellcheck.extendedResults=true, the response format for suggestions
|
|
has changed, see SOLR-1071.
|
|
|
|
Use of the "charset" option when configuring the following Analysis
|
|
Factories has been deprecated and will cause a warning to be logged.
|
|
In future versions of Solr attempting to use this option will cause an
|
|
error. See SOLR-1410 for more information.
|
|
- GreekLowerCaseFilterFactory
|
|
- RussianStemFilterFactory
|
|
- RussianLowerCaseFilterFactory
|
|
- RussianLetterTokenizerFactory
|
|
|
|
DIH: Evaluator API has been changed in a non back-compatible way. Users who
|
|
have developed custom Evaluators will need to change their code according to
|
|
the new API for it to work. See SOLR-996 for details.
|
|
|
|
DIH: The formatDate evaluator's syntax has been changed. The new syntax is
|
|
formatDate(<variable>, '<format_string>'). For example,
|
|
formatDate(x.date, 'yyyy-MM-dd'). In the old syntax, the date string was
|
|
written without single quotes. The old syntax has been deprecated and will
|
|
be removed in 1.5; until then, using the old syntax will log a warning.
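
Assuming the quoted format string uses java.text.SimpleDateFormat pattern letters (an assumption, not something stated above), the following sketch shows what a pattern such as 'yyyy-MM-dd' produces. The class name is hypothetical, and the time zone is fixed to UTC only to make the result reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class FormatDateSketch {
    /** Formats a date with the given pattern in UTC, independent of the default locale. */
    static String format(Date date, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L), "yyyy-MM-dd")); // 1970-01-01
    }
}
```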
|
|
|
|
DIH: The Context API has been changed in a non back-compatible way. In
|
|
particular, the Context.currentProcess() method now returns a String
|
|
describing the type of the current import process instead of an int.
|
|
Similarly, the public constants in Context viz. FULL_DUMP, DELTA_DUMP and
|
|
FIND_DELTA are changed to a String type. See SOLR-969 for details.
|
|
|
|
DIH: The EntityProcessor API has been simplified by moving logic for applying
|
|
transformers and handling multi-row outputs from Transformers into an
|
|
EntityProcessorWrapper class. The EntityProcessor#destroy is now called once
|
|
per parent-row at the end of row (end of data). A new method
|
|
EntityProcessor#close is added which is called at the end of import.
|
|
|
|
DIH: In Solr 1.3, if the last_index_time was not available (first import) and
|
|
a delta-import was requested, a full-import was run instead. This is no longer
|
|
the case. In Solr 1.4 delta import is run with last_index_time as the epoch
|
|
date (January 1, 1970, 00:00:00 GMT) if last_index_time is not available.
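
The fallback described above can be sketched as follows. The timestamp pattern and the use of java.util.Properties are assumptions made for this example, and the class name is hypothetical; only the last_index_time name and the epoch default come from the note above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Properties;
import java.util.TimeZone;

final class LastIndexTimeSketch {
    // Assumed timestamp layout for this sketch; the real properties file may differ.
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    /** Returns the stored last_index_time, or the epoch when no value is available. */
    static Date lastIndexTimeOrEpoch(Properties props) throws ParseException {
        String stored = props.getProperty("last_index_time");
        if (stored == null) {
            return new Date(0L); // January 1, 1970, 00:00:00 GMT
        }
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.parse(stored);
    }

    public static void main(String[] args) throws ParseException {
        // No stored value, so the delta import would start from the epoch.
        System.out.println(lastIndexTimeOrEpoch(new Properties()).getTime()); // 0
    }
}
```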
|
|
|
|
Versions of Major Components
|
|
----------------------------
|
|
Apache Lucene 2.9.1 (r832363 on 2.9 branch)
|
|
Apache Tika 0.4
|
|
Carrot2 3.1.0
|
|
|
|
Lucene Information
|
|
----------------
|
|
|
|
Since Solr is built on top of Lucene, many people add customizations to Solr
|
|
that are dependent on Lucene. Please see http://lucene.apache.org/java/2_9_0/,
|
|
especially http://lucene.apache.org/java/2_9_0/changes/Changes.html for more
|
|
information on the version of Lucene used in Solr.
|
|
|
|
Detailed Change List
|
|
----------------------
|
|
|
|
New Features
|
|
----------------------
|
|
1. SOLR-560: Use SLF4J logging API rather than JDK logging. The packaged .war file is
|
|
shipped with a JDK logging implementation, so logging configuration for the .war should
|
|
be identical to Solr 1.3. However, if you are using the .jar file, you can select
|
|
which logging implementation to use by dropping a different binding.
|
|
See: http://www.slf4j.org/ (ryan)
|
|
|
|
2. SOLR-617: Allow configurable index deletion policy and provide a default implementation which
|
|
allows deletion of commit points on various criteria such as number of commits, age of commit
|
|
point and optimized status.
|
|
See http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/index/IndexDeletionPolicy.html
|
|
(yonik, Noble Paul, Akshay Ukey via shalin)
|
|
|
|
3. SOLR-658: Allow Solr to load index from arbitrary directory in dataDir
|
|
(Noble Paul, Akshay Ukey via shalin)
|
|
|
|
4. SOLR-793: Add 'commitWithin' argument to the update add command. This behaves
|
|
similarly to the global autoCommit maxTime argument except that it is set for
|
|
each request. (ryan)
|
|
|
|
5. SOLR-670: Add support for rollbacks in UpdateHandler. This allows users to roll back all changes
|
|
since the last commit. (Noble Paul, koji via shalin)
|
|
|
|
6. SOLR-813: Adding DoubleMetaphone Filter and Factory. Similar to the PhoneticFilter,
|
|
but this uses DoubleMetaphone specific calls (including alternate encoding)
|
|
(Todd Feak via ryan)
|
|
|
|
7. SOLR-680: Add StatsComponent. This gets simple statistics on matched numeric fields,
|
|
including: min, max, mean, median, stddev. (koji, ryan)
|
|
|
|
- SOLR-1380: Added support for multi-valued fields (Harish Agarwal via gsingers)
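
For readers unfamiliar with the statistics listed for SOLR-680, here is a minimal sketch that computes min, max, mean and a sample standard deviation over an array of values. The class name is hypothetical, the choice of the sample (n - 1) form is an assumption, and this is not the StatsComponent implementation.

```java
import java.util.Arrays;

final class StatsSketch {
    /** Returns {min, max, mean, sample standard deviation}; values must be non-empty. */
    static double[] summarize(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double min = values[0];
        double max = values[0];
        double sum = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        double mean = sum / values.length;
        double squaredDiffs = 0.0;
        for (double v : values) {
            squaredDiffs += (v - mean) * (v - mean);
        }
        // With a single value there is no spread to estimate, so report 0.
        double stddev = values.length > 1 ? Math.sqrt(squaredDiffs / (values.length - 1)) : 0.0;
        return new double[] {min, max, mean, stddev};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(summarize(new double[] {2, 4, 4, 4, 5, 5, 7, 9})));
    }
}
```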
|
|
|
|
8. SOLR-561: Added Replication implemented in Java as a request handler. Supports index replication
|
|
as well as configuration replication and exposes detailed statistics and progress information
|
|
on the Admin page. Works on all platforms. (Noble Paul, yonik, Akshay Ukey, shalin)
|
|
|
|
9. SOLR-746: Added "omitHeader" request parameter to omit the header from the response.
|
|
(Noble Paul via shalin)
|
|
|
|
10. SOLR-651: Added TermVectorComponent for serving up term vector information, plus IDF.
|
|
See http://wiki.apache.org/solr/TermVectorComponent (gsingers, Vaijanath N. Rao, Noble Paul)
|
|
|
|
12. SOLR-795: SpellCheckComponent supports building indices on optimize if configured in solrconfig.xml
|
|
(Jason Rennie, shalin)
|
|
|
|
13. SOLR-667: A LRU cache implementation based upon ConcurrentHashMap and other techniques to reduce
|
|
contention and synchronization overhead, to utilize multiple CPU cores more effectively.
|
|
(Fuad Efendi, Noble Paul, yonik via shalin)
|
|
|
|
14. SOLR-465: Add configurable DirectoryProvider so that alternate Directory
|
|
implementations can be specified via solrconfig.xml. The default
|
|
DirectoryProvider will use NIOFSDirectory for better concurrency
|
|
on non Windows platforms. (Mark Miller, TJ Laurenzo via yonik)
|
|
|
|
15. SOLR-822: Add CharFilter so that characters can be filtered (e.g. character normalization)
|
|
before Tokenizer/TokenFilters. (koji)
|
|
|
|
16. SOLR-829: Allow slaves to request compressed files from master during replication
|
|
(Simon Collins, Noble Paul, Akshay Ukey via shalin)
|
|
|
|
17. SOLR-877: Added TermsComponent for accessing Lucene's TermEnum capabilities.
|
|
Useful for auto suggest and possibly distributed search. Not distributed search compliant. (gsingers)
|
|
- Added mincount and maxcount options (Khee Chin via gsingers)
|
|
|
|
18. SOLR-538: Add maxChars attribute for copyField function so that the length limit for destination
|
|
can be specified.
|
|
(Georgios Stamatis, Lars Kotthoff, Chris Harris via koji)
|
|
|
|
19. SOLR-284: Added support for extracting content from binary documents like MS Word and PDF using Apache Tika. See also contrib/extraction/CHANGES.txt (Eric Pugh, Chris Harris, yonik, gsingers)
|
|
|
|
20. SOLR-819: Added factories for Arabic support (gsingers)
|
|
|
|
21. SOLR-781: Distributed search ability to sort field.facet values
|
|
lexicographically. facet.sort values "true" and "false" are
|
|
also deprecated and replaced with "count" and "lex".
|
|
(Lars Kotthoff via yonik)
|
|
|
|
22. SOLR-821: Add support for replication to copy conf file to slave with a different name. This allows replication
|
|
of solrconfig.xml
|
|
(Noble Paul, Akshay Ukey via shalin)
|
|
|
|
23. SOLR-911: Add support for multi-select faceting by allowing filters to be
|
|
tagged and facet commands to exclude certain filters. This patch also
|
|
added the ability to change the output key for facets in the response, and
|
|
optimized distributed faceting refinement by lowering parsing overhead and
|
|
by making requests and responses smaller.
|
|
|
|
24. SOLR-876: WordDelimiterFilter now supports a splitOnNumerics
|
|
option, as well as a list of protected terms.
|
|
(Dan Rosher via hossman)
|
|
|
|
25. SOLR-928: SolrDocument and SolrInputDocument now implement the Map<String,?>
|
|
interface. This should make plugging into other standard tools easier. (ryan)
|
|
|
|
26. SOLR-847: Enhance the snappull command in ReplicationHandler to accept masterUrl.
|
|
(Noble Paul, Preetam Rao via shalin)
|
|
|
|
27. SOLR-540: Add support for globbing in field names to highlight.
|
|
For example, hl.fl=*_text will highlight all fieldnames ending with
|
|
_text. (Lars Kotthoff via yonik)
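
The globbing in the hl.fl=*_text example can be pictured as a simple wildcard match over field names. The sketch below is an assumption about the general idea only (a '*' matching any run of characters); the class, method and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class FieldGlobSketch {
    /** Converts a glob where '*' matches any run of characters into a regex. */
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the field names that match the whole glob, in their original order. */
    static List<String> matchingFields(String glob, List<String> fieldNames) {
        Pattern pattern = globToPattern(glob);
        List<String> matches = new ArrayList<>();
        for (String name : fieldNames) {
            if (pattern.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("title_text", "body_text", "id", "text_length");
        System.out.println(matchingFields("*_text", fields)); // [title_text, body_text]
    }
}
```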
|
|
|
|
28. SOLR-906: Adding a StreamingUpdateSolrServer that writes update commands to
|
|
an open HTTP connection. If you are using solrj for bulk update requests
|
|
you should consider switching to this implementation. However, note that
|
|
the error handling is not immediate as it is with the standard SolrServer.
|
|
(ryan)
|
|
|
|
29. SOLR-865: Adding support for document updates in binary format and corresponding support in Solrj client.
|
|
(Noble Paul via shalin)
|
|
|
|
30. SOLR-763: Add support for Lucene's PositionFilter (Mck SembWever via shalin)
|
|
|
|
31. SOLR-966: Enhance the map() function query to take in an optional default value (Noble Paul, shalin)
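
A sketch of how the optional default value might behave, assuming map(x, min, max, target) replaces values inside the inclusive range [min, max] with target and otherwise leaves x unchanged, while the five-argument form returns the default instead. These semantics are an assumption for illustration and the class name is hypothetical.

```java
final class MapFunctionSketch {
    /** Four-argument form: values in [min, max] become target, others pass through. */
    static double map(double x, double min, double max, double target) {
        return (x >= min && x <= max) ? target : x;
    }

    /** Five-argument form: values outside [min, max] become defaultValue instead. */
    static double map(double x, double min, double max, double target, double defaultValue) {
        return (x >= min && x <= max) ? target : defaultValue;
    }

    public static void main(String[] args) {
        System.out.println(map(0, 0, 0, 1));    // 1.0
        System.out.println(map(5, 0, 0, 1));    // 5.0
        System.out.println(map(5, 0, 0, 1, 2)); // 2.0
    }
}
```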
|
|
|
|
32. SOLR-820: Support replication on startup of master with new index. (Noble Paul, Akshay Ukey via shalin)
|
|
|
|
33. SOLR-943: Make it possible to specify dataDir in solr.xml and accept the dataDir as a request parameter for
|
|
the CoreAdmin create command. (Noble Paul via shalin)
|
|
|
|
34. SOLR-850: Addition of timeouts for distributed searching. Configurable through 'shard-socket-timeout' and
|
|
'shard-connection-timeout' parameters in SearchHandler. (Patrick O'Leary via shalin)
|
|
|
|
35. SOLR-799: Add support for hash based exact/near duplicate document
|
|
handling. (Mark Miller, yonik)
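
The exact-duplicate half of this idea can be sketched by hashing selected field values into a signature and comparing signatures. The use of MD5, the separator byte, and the class and method names are all assumptions for illustration; near-duplicate detection needs a fuzzier signature and is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DedupeSignatureSketch {
    /** Returns a hex MD5 signature over the given field values, in order. */
    static String signature(String... fieldValues) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String value : fieldValues) {
                md5.update(value.getBytes(StandardCharsets.UTF_8));
                md5.update((byte) 0); // separator so ("ab", "c") differs from ("a", "bc")
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String first = signature("Solr 1.4", "release notes");
        String second = signature("Solr 1.4", "release notes");
        System.out.println(first.equals(second)); // true
    }
}
```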
|
|
|
|
36. SOLR-1026: Add protected words support to SnowballPorterFilterFactory (ehatcher)
|
|
|
|
37. SOLR-739: Add support for OmitTf (Mark Miller via yonik)
|
|
|
|
38. SOLR-1046: Nested query support for the function query parser
|
|
and lucene query parser (the latter existed as an undocumented
|
|
feature in 1.3) (yonik)
|
|
|
|
39. SOLR-940: Add support for Lucene's Trie Range Queries by providing new FieldTypes in
|
|
schema for int, float, long, double and date. Single-valued Trie based
|
|
fields with a precisionStep will index multiple precisions and enable
|
|
faster range queries. (Uwe Schindler, yonik, shalin)
|
|
|
|
40. SOLR-1038: Enhance CommonsHttpSolrServer to add docs in batch using an iterator API (Noble Paul via shalin)
|
|
|
|
41. SOLR-844: A SolrServer implementation that front-ends multiple Solr servers and provides load balancing and failover
|
|
support (Noble Paul, Mark Miller, hossman via shalin)
|
|
|
|
42. SOLR-939: ValueSourceRangeFilter/Query - filter based on values in a FieldCache entry or on any arbitrary function of field values. (yonik)
|
|
|
|
43. SOLR-1095: Fixed performance problem in the StopFilterFactory and simplified code. Added tests as well. (gsingers)
|
|
|
|
44. SOLR-1096: Introduced httpConnTimeout and httpReadTimeout in replication slave configuration to avoid stalled
|
|
replication. (Jeff Newburn, Noble Paul, shalin)
|
|
|
|
45. SOLR-1115: <bool>on</bool> and <bool>yes</bool> work as expected in solrconfig.xml. (koji)
|
|
|
|
46. SOLR-1099: A FieldAnalysisRequestHandler which provides the analysis functionality of the web admin page as
|
|
a service. The AnalysisRequestHandler is renamed to DocumentAnalysisRequestHandler which is enhanced with
|
|
query analysis and showMatch support. AnalysisRequestHandler is now deprecated. Support for both
|
|
FieldAnalysisRequestHandler and DocumentAnalysisRequestHandler is also provided in the Solrj client.
|
|
(Uri Boness, shalin)
|
|
|
|
47. SOLR-1106: Made CoreAdminHandler Actions pluggable so that additional actions may be plugged in or the existing
|
|
ones can be overridden if needed. (Kay Kay, Noble Paul, shalin)
|
|
|
|
48. SOLR-1124: Add a top() function query that causes its argument to
|
|
have its values derived from the top level IndexReader, even when
|
|
invoked from a sub-reader. top() is implicitly used for the
|
|
ord() and rord() functions. (yonik)
|
|
|
|
49. SOLR-1110: Support sorting on trie fields with Distributed Search. (Mark Miller, Uwe Schindler via shalin)
|
|
|
|
50. SOLR-1121: CoreAdminHandler should not need a core. This makes it possible to start a Solr server without a core. (noble)
|
|
|
|
51. SOLR-769: Added support for clustering in contrib/clustering. See http://wiki.apache.org/solr/ClusteringComponent for more info. (gsingers, Stanislaw Osinski)
|
|
|
|
52. SOLR-1175: Disable/enable replication on the master side. Added two commands, 'enableReplication' and 'disableReplication'. (noble)
|
|
|
|
53. SOLR-1179: DocSets can now be used as Lucene Filters via
|
|
DocSet.getTopFilter() (yonik)
|
|
|
|
54. SOLR-1116: Add a Binary FieldType (noble)
|
|
|
|
55. SOLR-1051: Support the merge of multiple indexes as a CoreAdmin and an update command (Ning Li via shalin)
|
|
|
|
56. SOLR-1152: Snapshoot on ReplicationHandler should accept location as a request parameter (shalin)
|
|
|
|
57. SOLR-1204: Enhance SpellingQueryConverter to handle UTF-8 instead of ASCII only.
|
|
Use the NMTOKEN syntax for matching field names.
|
|
(Michael Ludwig, shalin)
|
|
|
|
58. SOLR-1189: Support providing username and password for basic HTTP authentication in Java replication
|
|
(Matthew Gregg, shalin)
|
|
|
|
59. SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader implementations
|
|
can be specified via solrconfig.xml. Note that using a custom IndexReader may be incompatible
|
|
with ReplicationHandler (see comments in SOLR-1366). This should be treated as an experimental feature.
|
|
(Andrzej Bialecki, hossman, Mark Miller, John Wang)
|
|
|
|
60. SOLR-1214: Differentiate between solr home and instanceDir. Deprecates the method SolrResourceLoader#locateInstanceDir()
|
|
and it is renamed to locateSolrHome (noble)
|
|
|
|
61. SOLR-1216: Disambiguate the replication command names. 'snappull' becomes 'fetchindex' and 'abortsnappull' becomes 'abortfetch'. (noble)
|
|
|
|
62. SOLR-1145: Add capability to specify an infoStream log file for the underlying Lucene IndexWriter in solrconfig.xml.
|
|
This is an advanced debug log file that can be used to aid developers in fixing IndexWriter bugs. See the commented
|
|
out example in the example solrconfig.xml under the indexDefaults section.
|
|
(Chris Harris, Mark Miller)
|
|
|
|
63. SOLR-1256: Show the output of CharFilters in analysis.jsp. (koji)
|
|
|
|
64. SOLR-1266: Added stemEnglishPossessive option (default=true) to WordDelimiterFilter
|
|
that allows disabling of English possessive stemming (removal of trailing 's from tokens)
|
|
(Robert Muir via yonik)
|
|
|
|
65. SOLR-1237: firstSearcher and newSearcher can now be identified via the CommonParams.EVENT (evt) parameter
|
|
in a request. This allows a RequestHandler or SearchComponent to know when a newSearcher or firstSearcher
|
|
event happened. QuerySenderListener is the only implementation in Solr that implements this, but outside
|
|
implementations may wish to. See the AbstractSolrEventListener for a helper method. (gsingers)
|
|
|
|
66. SOLR-1343: Added HTMLStripCharFilter and marked HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
|
|
HTMLStripStandardTokenizerFactory deprecated. To strip HTML tags, HTMLStripCharFilter can be used
|
|
with an arbitrary Tokenizer. (koji)
|
|
|
|
67. SOLR-1275: Add expungeDeletes to DirectUpdateHandler2 (noble)
|
|
|
|
68. SOLR-1372: Enhance FieldAnalysisRequestHandler to accept field value from content stream (ehatcher)
|
|
|
|
69. SOLR-1370: Show the output of CharFilters in FieldAnalysisRequestHandler (koji)
|
|
|
|
70. SOLR-1373: Add Filter query to admin/form.jsp
|
|
(Jason Rutherglen via hossman)
|
|
|
|
71. SOLR-1368: Add ms() function query for getting milliseconds from dates and for
|
|
high precision date subtraction, add sub() for subtracting other arguments.
|
|
(yonik)
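
As an illustration of the date arithmetic involved, the sketch below returns milliseconds since the epoch for one date and the millisecond difference between two dates. It assumes the two-argument form means the first date minus the second; the class name is hypothetical and the code uses java.time rather than anything from Solr.

```java
import java.time.Instant;

final class MsFunctionSketch {
    /** Milliseconds since the epoch for a single date. */
    static long ms(Instant a) {
        return a.toEpochMilli();
    }

    /** Milliseconds between two dates, computed as a minus b. */
    static long ms(Instant a, Instant b) {
        return a.toEpochMilli() - b.toEpochMilli();
    }

    public static void main(String[] args) {
        Instant later = Instant.parse("2009-01-01T00:00:01Z");
        Instant earlier = Instant.parse("2009-01-01T00:00:00Z");
        System.out.println(ms(later, earlier)); // 1000
    }
}
```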
|
|
|
|
72. SOLR-1156: Sort TermsComponent results by frequency (Matt Weber via yonik)
|
|
|
|
73. SOLR-1335: Load core properties from a properties file (noble)
|
|
|
|
74. SOLR-1385: Add an 'enable' attribute to all plugins (noble)
|
|
|
|
75. SOLR-1414: Implicit core properties are not set for single core (noble)
|
|
|
|
76. SOLR-659: Adds shards.start and shards.rows to distributed search
|
|
to allow more efficient bulk queries (those that retrieve many or all
|
|
documents). (Brian Whitman via yonik)
|
|
|
|
77. SOLR-1321: Add better support for efficient wildcard handling (Andrzej Bialecki, Robert Muir, gsingers)
|
|
|
|
78. SOLR-1326: New interface PluginInfoInitialized for all types of plugin (noble)
|
|
|
|
79. SOLR-1447: Simple property injection. <mergePolicy> & <mergeScheduler> syntaxes are now deprecated
|
|
(Jason Rutherglen, noble)
|
|
|
|
80. SOLR-908: CommonGramsFilterFactory/CommonGramsQueryFilterFactory for
|
|
speeding up phrase queries containing common words by indexing
|
|
n-grams and using them at query time.
|
|
(Tom Burton-West, Jason Rutherglen via yonik)
|
|
|
|
81. SOLR-1292: Add FieldCache introspection to stats.jsp and JMX Monitoring via
|
|
a new SolrFieldCacheMBean. (hossman)
|
|
|
|
82. SOLR-1167: Solr Config now supports XInclude for XML engines that can support it. (Bryan Talbot via gsingers)
|
|
|
|
83. SOLR-1478: Enable sort by Lucene docid. (ehatcher)
|
|
|
|
84. SOLR-1449: Add <lib> elements to solrconfig.xml for specifying additional
|
|
classpath directories and regular expressions. (hossman via yonik)
|
|
|
|
85. SOLR-1128: Added metadata output to extraction request handler "extract
|
|
only" option. (gsingers)
|
|
|
|
86. SOLR-1274: Added text serialization output for extractOnly
|
|
(Peter Wolanin, gsingers)
|
|
|
|
87. SOLR-768: DIH: Set last_index_time variable in full-import command.
|
|
(Wojtek Piaseczny, Noble Paul via shalin)
|
|
|
|
88. SOLR-811: Allow a "deltaImportQuery" attribute in SqlEntityProcessor
|
|
which is used for delta imports instead of DataImportHandler manipulating
|
|
the SQL itself. (Noble Paul via shalin)
|
|
|
|
89. SOLR-842: Better error handling in DataImportHandler with options to
|
|
abort, skip and continue imports. (Noble Paul, shalin)
|
|
|
|
90. SOLR-833: DIH: A DataSource to read data from a field as a reader. This
|
|
can be used, for example, to read XMLs residing as CLOBs or BLOBs in
|
|
databases. (Noble Paul via shalin)
|
|
|
|
91. SOLR-887: A DIH Transformer to strip HTML tags. (Ahmed Hammad via shalin)
|
|
|
|
92. SOLR-886: DataImportHandler should rollback when an import fails or it is
|
|
aborted (shalin)
|
|
|
|
93. SOLR-891: A DIH Transformer to read strings from Clob type.
|
|
(Noble Paul via shalin)
|
|
|
|
94. SOLR-812: Configurable JDBC settings in JdbcDataSource including optimized
|
|
defaults for read only mode. (David Smiley, Glen Newton, shalin)
|
|
|
|
95. SOLR-910: Add a few utility commands to the DIH admin page such as full
|
|
import, delta import, status, reload config. (Ahmed Hammad via shalin)
|
|
|
|
96. SOLR-938: Add event listener API for DIH import start and end.
|
|
(Kay Kay, Noble Paul via shalin)
|
|
|
|
97. SOLR-801: DIH: Add support for configurable pre-import and post-import
|
|
delete query per root-entity. (Noble Paul via shalin)
|
|
|
|
98. SOLR-988: Add a new scope for session data stored in Context to store
|
|
objects across imports. (Noble Paul via shalin)
|
|
|
|
99. SOLR-980: A PlainTextEntityProcessor which can read from any
|
|
DataSource<Reader> and output a String.
|
|
(Nathan Adams, Noble Paul via shalin)
|
|
|
|
100.SOLR-1003: XPathEntityProcessor must allow slurping all text from a given
|
|
xml node and its children. (Noble Paul via shalin)
|
|
|
|
101.SOLR-1001: Allow variables in various attributes of RegexTransformer,
|
|
HTMLStripTransformer and NumberFormatTransformer.
|
|
(Fergus McMenemie, Noble Paul, shalin)
|
|
|
|
102.SOLR-989: DIH: Expose running statistics from the Context API.
|
|
(Noble Paul, shalin)
|
|
|
|
103.SOLR-996: DIH: Expose Context to Evaluators. (Noble Paul, shalin)
|
|
|
|
104.SOLR-783: DIH: Enhance delta-imports by maintaining separate
|
|
last_index_time for each entity. (Jon Baer, Noble Paul via shalin)
|
|
|
|
105.SOLR-1033: Current entity's namespace is made available to all DIH
|
|
Transformers. This allows one to use an output field of TemplateTransformer
|
|
in other transformers, among other things.
|
|
(Fergus McMenemie, Noble Paul via shalin)
|
|
|
|
106.SOLR-1066: New methods in DIH Context to expose Script details.
|
|
ScriptTransformer changed to read scripts through the new API methods.
|
|
(Noble Paul via shalin)
|
|
|
|
107.SOLR-1062: A DIH LogTransformer which can log data in a given template
|
|
format. (Jon Baer, Noble Paul via shalin)
|
|
|
|
108.SOLR-1065: A DIH ContentStreamDataSource which can accept HTTP POST data
|
|
in a content stream. This can be used to push data to Solr instead of
|
|
just pulling it from DB/Files/URLs. (Noble Paul via shalin)
|
|
|
|
109.SOLR-1061: Improve DIH RegexTransformer to create multiple columns from
|
|
regex groups. (Noble Paul via shalin)
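
The idea of turning regex groups into several columns can be sketched with java.util.regex. The column names, the input value and the class name below are hypothetical, and this is not the RegexTransformer code or its configuration syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexGroupsSketch {
    /** Maps capture group 1 to the first column name, group 2 to the second, and so on. */
    static Map<String, String> split(String value, Pattern regex, String... columnNames) {
        Map<String, String> row = new LinkedHashMap<>();
        Matcher matcher = regex.matcher(value);
        if (matcher.find()) {
            for (int i = 0; i < columnNames.length && i < matcher.groupCount(); i++) {
                row.put(columnNames[i], matcher.group(i + 1));
            }
        }
        return row; // empty when the regex does not match
    }

    public static void main(String[] args) {
        Pattern regex = Pattern.compile("(\\w+), (\\w+)");
        System.out.println(split("Doe, Jane", regex, "lastName", "firstName"));
        // {lastName=Doe, firstName=Jane}
    }
}
```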
|
|
|
|
110.SOLR-1059: Special DIH flags introduced for deleting documents by query or
|
|
id, skipping rows and stopping further transforms. Use $deleteDocById,
|
|
$deleteDocByQuery for deleting by id and query respectively. Use $skipRow
|
|
to skip the current row but continue with the document. Use $stopTransform
|
|
to stop further transformers. New methods are introduced in Context for
|
|
deleting by id and query. (Noble Paul, Fergus McMenemie, shalin)
|
|
|
|
111.SOLR-1076: JdbcDataSource should resolve DIH variables in all its
|
|
configuration parameters. (shalin)
|
|
|
|
112.SOLR-1055: Make DIH JdbcDataSource easily extensible by making the
|
|
createConnectionFactory method protected and return a
|
|
Callable<Connection> object. (Noble Paul, shalin)
|
|
|
|
113.SOLR-1058: DIH: JdbcDataSource can lookup javax.sql.DataSource using JNDI.
|
|
Use a jndiName attribute to specify the location of the data source.
|
|
(Jason Shepherd, Noble Paul via shalin)
|
|
|
|
114.SOLR-1083: A DIH Evaluator for escaping query characters.
|
|
(Noble Paul, shalin)
|
|
|
|
115.SOLR-934: A MailEntityProcessor to enable indexing mails from
|
|
POP/IMAP sources into a solr index. (Preetam Rao, shalin)
|
|
|
|
116.SOLR-1060: A DIH LineEntityProcessor which can stream lines of text from a
|
|
given file to be indexed directly or for processing with transformers and
|
|
child entities.
|
|
(Fergus McMenemie, Noble Paul, shalin)
|
|
|
|
117.SOLR-1127: Add support for DIH field name to be templatized.
|
|
(Noble Paul, shalin)
|
|
|
|
118.SOLR-1092: Added a new DIH command named 'import' which does not
|
|
automatically clean the index. This is useful and more appropriate when one
|
|
needs to import only some of the entities.
|
|
(Noble Paul via shalin)
|
|
|
|
119.SOLR-1153: DIH 'deltaImportQuery' is honored on child entities as well
|
|
(noble)
|
|
|
|
120.SOLR-1230: Enhanced dataimport.jsp to work with all DataImportHandler
|
|
request handler configurations, rather than just a hardcoded /dataimport
|
|
handler. (ehatcher)
|
|
|
|
121.SOLR-1235: disallow period (.) in DIH entity names (noble)
|
|
|
|
122.SOLR-1234: Multiple DIH does not work because all of them write to
|
|
dataimport.properties. Use the handler name as the properties file name
|
|
(noble)
|
|
|
|
123.SOLR-1348: Support binary field type in convertType logic in DIH
|
|
JdbcDataSource (shalin)
|
|
|
|
124.SOLR-1406: DIH: Make FileDataSource and FileListEntityProcessor more
|
|
extensible (Luke Forehand, shalin)
|
|
|
|
125.SOLR-1437: DIH: XPathEntityProcessor can deal with xpath syntaxes such as
|
|
//tagname , /root//tagname (Fergus McMenemie via noble)
|
|
|
|
|
|
Optimizations
|
|
----------------------
|
|
1. SOLR-374: Use IndexReader.reopen to save resources by re-using parts of the
|
|
index that haven't changed. (Mark Miller via yonik)
|
|
|
|
2. SOLR-808: Write string keys in Maps as extern strings in the javabin format. (Noble Paul via shalin)
|
|
|
|
3. SOLR-475: New faceting method with better performance and smaller memory usage for
|
|
multi-valued fields with many unique values but relatively few values per document.
|
|
Controllable via the facet.method parameter - "fc" is the new default method and "enum"
|
|
is the original method. (yonik)
|
|
|
|
4. SOLR-970: Use an ArrayList in SolrPluginUtils.parseQueryStrings
|
|
since we know exactly how long the List will be in advance.
|
|
(Kay Kay via hossman)
|
|
|
|
5. SOLR-1002: Change SolrIndexSearcher to use insertWithOverflow
|
|
with reusable priority queue entries to reduce the amount of
|
|
generated garbage during searching. (Mark Miller via yonik)
|
|
|
|
6. SOLR-971: Replace StringBuffer with StringBuilder for instances that do not require thread-safety.
|
|
(Kay Kay via shalin)
|
|
|
|
7. SOLR-921: SolrResourceLoader must cache short class name vs fully qualified classname
|
|
(Noble Paul, hossman via shalin)
|
|
|
|
8. SOLR-973: CommonsHttpSolrServer writes the xml directly to the server.
|
|
(Noble Paul via shalin)
|
|
|
|
9. SOLR-1108: Remove un-needed synchronization in SolrCore constructor.
|
|
(Noble Paul via shalin)
|
|
|
|
10. SOLR-1166: Speed up docset/filter generation by avoiding top-level
|
|
score() call and iterating over leaf readers with TermDocs. (yonik)
|
|
|
|
11. SOLR-1169: SortedIntDocSet - a new small set implementation
|
|
that saves memory over HashDocSet, is faster to construct,
|
|
is ordered for easier implementation of skipTo, and is faster
|
|
in the general case. (yonik)
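
One reason an ordered int set is convenient is that two sets can be intersected with a single linear merge. The sketch below shows that general technique on plain sorted arrays; it is not the SortedIntDocSet code, and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class SortedIntIntersectionSketch {
    /** Intersects two ascending arrays of distinct doc ids in O(a.length + b.length). */
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0;
        int j = 0;
        int n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out[n++] = a[i]; // present in both sets
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        int[] result = intersect(new int[] {1, 3, 5, 9}, new int[] {3, 4, 5, 10});
        System.out.println(Arrays.toString(result)); // [3, 5]
    }
}
```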
|
|
|
|
12. SOLR-1165: Use Lucene Filters and pass them down to the Lucene
|
|
search methods to filter earlier and improve performance. (yonik)
|
|
|
|
13. SOLR-1111: Use per-segment sorting to share fieldcache elements
|
|
across unchanged segments. This saves memory and reduces
|
|
commit times for incremental updates to the index. (yonik)
|
|
|
|
14. SOLR-1188: Minor efficiency improvement in TermVectorComponent related to ignoring positions or offsets (gsingers)
|
|
|
|
15. SOLR-1150: Load Documents for Highlighting one at a time rather than
|
|
all at once to avoid OOM with many large Documents. (Siddharth Gargate via Mark Miller)
|
|
|
|
16. SOLR-1353: Implement and use reusable token streams for analysis. (Robert Muir, yonik)
|
|
|
|
17. SOLR-1296: Enables setting IndexReader's termInfosIndexDivisor via a new attribute to StandardIndexReaderFactory. Enables
|
|
setting termIndexInterval to IndexWriter via SolrIndexConfig. (Jason Rutherglen, hossman, gsingers)
|
|
|
|
18. SOLR-846: DIH: Reduce memory consumption during delta import by removing
|
|
keys when used (Ricky Leung, Noble Paul via shalin)
|
|
|
|
19. SOLR-974: DataImportHandler skips commit if no data has been updated.
|
|
(Wojtek Piaseczny, shalin)
|
|
|
|
20. SOLR-1004: DIH: Check for abort more frequently during delta-imports.
|
|
(Marc Sturlese, shalin)
|
|
|
|
21. SOLR-1098: DIH DateFormatTransformer can cache the format objects.
|
|
(Noble Paul via shalin)
|
|
|
|
22. SOLR-1465: Replaced string concatenations with StringBuilder append
|
|
calls in DIH XPathRecordReader. (Mark Miller, shalin)
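
As an illustration of the idea behind SOLR-1169 above, here is a minimal sketch of a small doc-id set backed by a sorted int array. The class name SortedIntSetSketch and its methods are hypothetical and are not Solr's SortedIntDocSet API; the sketch only shows why ordered storage makes membership tests, skipTo-style lookups, and intersections straightforward.

```java
import java.util.Arrays;

/** Hypothetical sketch of a sorted-array doc-id set; not Solr's SortedIntDocSet. */
final class SortedIntSetSketch {
    private final int[] docs; // strictly increasing doc ids

    SortedIntSetSketch(int[] ids) {
        // Sort a copy and drop duplicates so lookups can use binary search.
        this.docs = Arrays.stream(ids).sorted().distinct().toArray();
    }

    int size() {
        return docs.length;
    }

    boolean exists(int doc) {
        return Arrays.binarySearch(docs, doc) >= 0;
    }

    /** Returns the first doc id >= target, or -1 if there is none. */
    int skipTo(int target) {
        int idx = Arrays.binarySearch(docs, target);
        if (idx < 0) {
            idx = -idx - 1; // convert "not found" result to the insertion point
        }
        return idx < docs.length ? docs[idx] : -1;
    }

    /** Intersects two sets with a linear merge over the sorted arrays. */
    SortedIntSetSketch intersection(SortedIntSetSketch other) {
        int[] out = new int[Math.min(docs.length, other.docs.length)];
        int i = 0, j = 0, n = 0;
        while (i < docs.length && j < other.docs.length) {
            if (docs[i] < other.docs[j]) {
                i++;
            } else if (docs[i] > other.docs[j]) {
                j++;
            } else {
                out[n++] = docs[i];
                i++;
                j++;
            }
        }
        return new SortedIntSetSketch(Arrays.copyOf(out, n));
    }

    int[] toArray() {
        return docs.clone();
    }
}
```

An int array stores only the ids themselves, whereas a hash-based set typically reserves extra empty slots; that is the kind of memory saving the entry refers to.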
|
|
|
|
Bug Fixes
|
|
----------------------
|
|
1. SOLR-774: Fixed logging level display (Sean Timm via Otis Gospodnetic)
|
|
|
|
2. SOLR-771: CoreAdminHandler STATUS should display 'normalized' paths (koji, hossman, shalin)
|
|
|
|
3. SOLR-532: WordDelimiterFilter now respects payloads and other attributes of the original Token by
|
|
using Token.clone() (Tricia Williams, gsingers)
|
|
|
|
4. SOLR-805: DisMax queries are not being cached in QueryResultCache (Todd Feak via koji)
|
|
|
|
5. SOLR-751: WordDelimiterFilter didn't adjust the start offset of single
|
|
tokens that started with delimiters, leading to incorrect highlighting.
|
|
(Stefan Oestreicher via yonik)
|
|
|
|
7. SOLR-843: SynonymFilterFactory cannot handle multiple synonym files correctly (koji)
|
|
|
|
8. SOLR-840: BinaryResponseWriter does not handle incompatible data in fields (Noble Paul via shalin)
|
|
|
|
9. SOLR-803: CoreAdminRequest.createCore fails because name parameter isn't set (Sean Colombo via ryan)
|
|
|
|
10. SOLR-869: Fix file descriptor leak in SolrResourceLoader#getLines (Mark Miller, shalin)
|
|
|
|
11. SOLR-872: Better error message for incorrect copyField destination (Noble Paul via shalin)
|
|
|
|
12. SOLR-879: Enable position increments in the query parser and fix the
|
|
example schema to enable position increments for the stop filter in
|
|
both the index and query analyzers to fix the bug with phrase queries
|
|
with stopwords. (yonik)
|
|
|
|
13. SOLR-836: Add missing "a" to the example stopwords.txt (yonik)
|
|
|
|
14. SOLR-892: Fix serialization of booleans for PHPSerializedResponseWriter
|
|
(yonik)
|
|
|
|
15. SOLR-898: Fix null pointer exception for the JSON response writer
|
|
based formats when nl.json=arrarr with null keys. (yonik)
|
|
|
|
16. SOLR-901: FastOutputStream ignores write(byte[]) call. (Noble Paul via shalin)
|
|
|
|
17. SOLR-807: BinaryResponseWriter writes fieldType.toExternal if it is not a supported type,
|
|
otherwise it writes fieldType.toObject. This fixes the bug with encoding/decoding UUIDField.
|
|
(koji, Noble Paul, shalin)
|
|
|
|
18. SOLR-863: SolrCore.initIndex should close the directory it gets for clearing the lock and
|
|
use the DirectoryFactory. (Mark Miller via shalin)
|
|
|
|
19. SOLR-802: Fix a potential null pointer error in the distributed FacetComponent
|
|
(David Bowen via ryan)
|
|
|
|
20. SOLR-346: Use perl regex to improve accuracy of finding latest snapshot in snapinstaller (billa)
|
|
|
|
21. SOLR-830: Use perl regex to improve accuracy of finding latest snapshot in snappuller (billa)
|
|
|
|
22. SOLR-897: Fixed Argument list too long error when there are lots of snapshots/backups (Dan Rosher via billa)
|
|
|
|
23. SOLR-925: Fixed highlighting on fields with multiValued="true" and termOffsets="true" (koji)
|
|
|
|
24. SOLR-902: FastInputStream#read(byte b[], int off, int len) gives incorrect results when amount left to read is less
|
|
than buffer size (Noble Paul via shalin)
|
|
|
|
25. SOLR-978: Old files are not removed from slaves after replication (Jaco, Noble Paul, shalin)
|
|
|
|
26. SOLR-883: Implicit properties are not set for Cores created through CoreAdmin (Noble Paul via shalin)
|
|
|
|
27. SOLR-991: Better error message when parsing solrconfig.xml fails due to malformed XML. Error message notes the name
|
|
of the file being parsed. (Michael Henson via shalin)
|
|
|
|
28. SOLR-1008: Fix stats.jsp XML encoding for <stat> item entries with ampersands in their names. (ehatcher)
|
|
|
|
29. SOLR-976: deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a <delete>.
|
|
Now both delete by id and delete by query can be specified at the same time as follows.
|
|
<delete>
|
|
<id>05991</id><id>06000</id>
|
|
<query>office:Bridgewater</query><query>office:Osaka</query>
|
|
</delete>
|
|
(koji)
|
|
|
|
30. SOLR-1016: HTTP 503 error is changed to 500 in SolrCore (koji)
|
|
|
|
31. SOLR-1015: Incomplete information in replication admin page and http command response when server
|
|
is both master and slave i.e. when server is a repeater (Akshay Ukey via shalin)
|
|
|
|
32. SOLR-1018: Slave is unable to replicate when server acts as repeater (as both master and slave)
|
|
(Akshay Ukey, Noble Paul via shalin)
|
|
|
|
33. SOLR-1031: Fix XSS vulnerability in schema.jsp (Paul Lovvik via ehatcher)
|
|
|
|
34. SOLR-1064: registry.jsp incorrectly displaying info for last core initialized
|
|
regardless of what the current core is. (hossman)
|
|
|
|
35. SOLR-1072: absolute paths used in sharedLib attribute were
|
|
incorrectly treated as relative paths. (hossman)
|
|
|
|
36. SOLR-1104: Fix some rounding errors in LukeRequestHandler's histogram (hossman)
|
|
|
|
37. SOLR-1125: Use query analyzer rather than index analyzer for queryFieldType in QueryElevationComponent
|
|
(koji)
|
|
|
|
38. SOLR-1126: Replicated files have incorrect timestamp (Jian Han Guo, Jeff Newburn, Noble Paul via shalin)
|
|
|
|
39. SOLR-1094: Incorrect value of correctlySpelled attribute in some cases (David Smiley, Mark Miller via shalin)
|
|
|
|
40. SOLR-965: Better error message when <pingQuery> is not configured.
|
|
(Mark Miller via hossman)
|
|
|
|
41. SOLR-1135: Java replication creates Snapshot in the directory where Solr was launched (Jianhan Guo via shalin)
|
|
|
|
42. SOLR-1138: Query Elevation Component now gracefully handles missing queries. (gsingers)
|
|
|
|
43. SOLR-929: LukeRequestHandler should return "dynamicBase" only if the field is dynamic.
|
|
(Peter Wolanin, koji)
|
|
|
|
44. SOLR-1141: NullPointerException during snapshoot command in java based replication (Jian Han Guo, shalin)
|
|
|
|
45. SOLR-1078: Fixes to WordDelimiterFilter to avoid splitting or dropping
|
|
international non-letter characters such as non spacing marks. (yonik)
|
|
|
|
46. SOLR-825, SOLR-1221: Enables highlighting for range/wildcard/fuzzy/prefix queries if using hl.usePhraseHighlighter=true
|
|
and hl.highlightMultiTerm=true. Also make both options default to true. (Mark Miller, yonik)
|
|
|
|
47. SOLR-1174: Fix Logging admin form submit url for multicore. (Jacob Singh via shalin)
|
|
|
|
48. SOLR-1182: Fix bug in OrdFieldSource#equals which could cause a bug with OrdFieldSource caching
|
|
on OrdFieldSource#hashcode collisions. (Mark Miller)
|
|
|
|
49. SOLR-1207: equals method should compare this and other of DocList in DocSetBase (koji)
|
|
|
|
50. SOLR-1242: Human readable JVM info from system handler does integer cutoff rounding, even when dealing
|
|
with GB. Fixed to round to one decimal place. (Jay Hill, Mark Miller)
|
|
|
|
51. SOLR-1243: Admin RequestHandlers should not be cached over HTTP. (Mark Miller)
|
|
|
|
52. SOLR-1260: Fix implementations of set operations for DocList subclasses
|
|
and fix a bug in HashDocSet construction when offset != 0. These bugs
|
|
never manifested in normal Solr use and only potentially affect
|
|
custom code. (yonik)
|
|
|
|
53. SOLR-1171: Fix LukeRequestHandler so it doesn't rely on SolrQueryParser
|
|
and report incorrect stats when field names contain characters
|
|
SolrQueryParser considers special.
|
|
(hossman)
|
|
|
|
54. SOLR-1317: Fix CapitalizationFilterFactory to work when keep parameter is not specified.
|
|
(ehatcher)
|
|
|
|
55. SOLR-1342: CapitalizationFilterFactory uses incorrect term length calculations.
|
|
(Robert Muir via Mark Miller)
|
|
|
|
56. SOLR-1359: DoubleMetaphoneFilter didn't index original tokens if there was no
|
|
alternative, and could incorrectly skip or reorder tokens. (yonik)
|
|
|
|
57. SOLR-1360: Prevent PhoneticFilter from producing duplicate tokens. (yonik)
|
|
|
|
58. SOLR-1371: LukeRequestHandler/schema.jsp errored if schema had no
|
|
uniqueKey field. The new test for this also (hopefully) adds some
|
|
future proofing against similar bugs in the future. As a side
|
|
effect QueryElevationComponentTest was refactored, and a bug in
|
|
that test was found. (hossman)
|
|
|
|
59. SOLR-914: General finalize() improvements. No finalizer delegates
|
|
to the respective close/destroy method w/o first checking if it's
|
|
already been closed/destroyed; if it hasn't, a SEVERE error is
|
|
logged first. (noble, hossman)
|
|
|
|
60. SOLR-1362: WordDelimiterFilter had inconsistent behavior when setting
|
|
the position increment of tokens following a token consisting of all
|
|
delimiters, and could additionally lose big position increments.
|
|
(Robert Muir, yonik)
|
|
|
|
61. SOLR-1091: Jetty's use of CESU-8 for code points outside the BMP
|
|
resulted in invalid output from the serialized PHP writer. (yonik)
|
|
|
|
62. SOLR-1103: LukeRequestHandler (and schema.jsp) have been fixed to
|
|
include the "1" (ie: 2**0) bucket in the term histogram data.
|
|
(hossman)
|
|
|
|
63. SOLR-1398: Add offset corrections in PatternTokenizerFactory.
|
|
(Anders Melchiorsen, koji)
|
|
|
|
64. SOLR-1400: Properly handle zero-length tokens in TrimFilter. This
|
|
was not a bug in any released version. (Peter Wolanin, gsingers)
|
|
|
|
65. SOLR-1071: spellcheck.extendedResults returns an invalid JSON response
|
|
when count > 1. To fix, the extendedResults format was changed.
|
|
(Uri Boness, yonik)
|
|
|
|
66. SOLR-1381: Fixed improper handling of fields that have only term positions and not term offsets during Highlighting (Thorsten Fischer, gsingers)
|
|
|
|
67. SOLR-1427: Fixed registry.jsp issue with MBeans (gsingers)
|
|
|
|
68. SOLR-1468: SolrJ's XML response parsing threw an exception for null
|
|
names, such as those produced when facet.missing=true (yonik)
|
|
|
|
69. SOLR-1471: Fixed issue with calculating missing values for facets in single valued cases in Stats Component.
|
|
This is not correctly calculated for the multivalued case. (James Miller, gsingers)
|
|
|
|
70. SOLR-1481: Fixed omitHeader parameter for PHP ResponseWriter. (Jun Ohtani via billa)
|
|
|
|
71. SOLR-1448: Add weblogic.xml to solr webapp to enable correct operation in
|
|
WebLogic. (Ilan Rabinovitch via yonik)
|
|
|
|
72. SOLR-1504: empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.
|
|
(koji)
|
|
|
|
73. SOLR-1394: HTMLStripCharFilter split tokens that contained entities and
|
|
often calculated offsets incorrectly for entities.
|
|
(Anders Melchiorsen via yonik)
|
|
|
|
74. SOLR-1517: Admin pages could stall waiting for localhost name resolution
|
|
if reverse DNS wasn't configured; this was changed so the DNS resolution
|
|
is attempted only once the first time an admin page is loaded.
|
|
(hossman)
|
|
|
|
75. SOLR-1529: More than 8 deleteByQuery commands in a single request
|
|
caused an error to be returned, although the deletes were
|
|
still executed. (asmodean via yonik)
|
|
|
|
76. SOLR-800: Deep copy collections to avoid ConcurrentModificationException
|
|
in XPathEntityProcessor while streaming
|
|
(Kyle Morrison, Noble Paul via shalin)
|
|
|
|
77. SOLR-823: Request parameter variables ${dataimporter.request.xxx} are not
|
|
resolved in DIH (Mck SembWever, Noble Paul, shalin)
|
|
|
|
78. SOLR-728: Add synchronization to avoid race condition of multiple DIH
|
|
imports working concurrently (Walter Ferrara, shalin)
|
|
|
|
79. SOLR-742: Add ability to create dynamic fields with custom
|
|
DataImportHandler transformers (Wojtek Piaseczny, Noble Paul, shalin)
|
|
|
|
80. SOLR-832: Rows parameter is not honored in DIH non-debug mode and can
|
|
abort a running import in debug mode. (Akshay Ukey, shalin)
|
|
|
|
81. SOLR-838: The DIH VariableResolver obtained from a DataSource's context
|
|
does not have current data. (Noble Paul via shalin)
|
|
|
|
82. SOLR-864: DataImportHandler does not catch and log Errors (shalin)
|
|
|
|
83. SOLR-873: Fix case-sensitive field names and columns (Jon Baer, shalin)
|
|
|
|
84. SOLR-893: Unable to delete documents via SQL and deletedPkQuery with
|
|
deltaimport (Dan Rosher via shalin)
|
|
|
|
85. SOLR-888: DIH DateFormatTransformer cannot convert non-string type
|
|
(Amit Nithian via shalin)
|
|
|
|
86. SOLR-841: DataImportHandler should throw exception if a field does not
|
|
have column attribute (Michael Henson, shalin)
|
|
|
|
87. SOLR-884: CachedSqlEntityProcessor should check if the cache key is
|
|
present in the query results (Noble Paul via shalin)
|
|
|
|
88. SOLR-985: Fix thread-safety issue with DIH TemplateString for concurrent
|
|
imports with multiple cores. (Ryuuichi Kumai via shalin)
|
|
|
|
89. SOLR-999: DIH XPathRecordReader fails on XMLs with nodes mixed with
|
|
CDATA content. (Fergus McMenemie, Noble Paul via shalin)
|
|
|
|
90. SOLR-1000: DIH FileListEntityProcessor should not apply fileName filter to
|
|
directory names. (Fergus McMenemie via shalin)
|
|
|
|
91. SOLR-1009: Repeated column names result in duplicate values.
|
|
(Fergus McMenemie, Noble Paul via shalin)
|
|
|
|
92. SOLR-1017: Fix DIH thread-safety issue with last_index_time for concurrent
|
|
imports in multiple cores due to unsafe usage of SimpleDateFormat by
|
|
multiple threads. (Ryuuichi Kumai via shalin)
|
|
|
|
93. SOLR-1024: Calling abort on DataImportHandler import commits data instead
|
|
of calling rollback. (shalin)
|
|
|
|
94. SOLR-1037: DIH should not add null values in a row returned by
|
|
EntityProcessor to documents. (shalin)
|
|
|
|
95. SOLR-1040: DIH XPathEntityProcessor fails with an xpath like
|
|
/feed/entry/link[@type='text/html']/@href (Noble Paul via shalin)
|
|
|
|
96. SOLR-1042: Fix memory leak in DIH by making TemplateString non-static
|
|
member in VariableResolverImpl (Ryuuichi Kumai via shalin)
|
|
|
|
97. SOLR-1053: IndexOutOfBoundsException in DIH SolrWriter.getResourceAsString
|
|
when size of data-config.xml is a multiple of 1024 bytes.
|
|
(Herb Jiang via shalin)
|
|
|
|
98. SOLR-1077: IndexOutOfBoundsException with useSolrAddSchema in DIH
|
|
XPathEntityProcessor. (Sam Keen, Noble Paul via shalin)
|
|
|
|
99. SOLR-1080: DIH RegexTransformer should not replace if regex is not matched.
|
|
(Noble Paul, Fergus McMenemie via shalin)
|
|
|
|
100. SOLR-1090: DataImportHandler should load the data-config.xml using UTF-8
|
|
encoding. (Rui Pereira, shalin)
|
|
|
|
101. SOLR-1146: ConcurrentModificationException in DataImporter.getStatusMessages
|
|
(Walter Ferrara, Noble Paul via shalin)
|
|
|
|
102. SOLR-1229: Fixes for DIH deletedPkQuery, particularly when using
|
|
transformed Solr unique id's
|
|
(Lance Norskog, Noble Paul via ehatcher)
|
|
|
|
103. SOLR-1286: Fix the DIH commit parameter always defaulting to "true" even
|
|
if "false" is explicitly passed in. (Jay Hill, Noble Paul via ehatcher)
|
|
|
|
104. SOLR-1323: Reset XPathEntityProcessor's $hasMore/$nextUrl when fetching
|
|
next URL (noble, ehatcher)
|
|
|
|
105. SOLR-1450: DIH: Jdbc connection properties such as batchSize are not
|
|
applied if the driver jar is placed in solr_home/lib.
|
|
(Steve Sun via shalin)
|
|
|
|
106. SOLR-1474: DIH Delta-import should run even if last_index_time is not set.
|
|
(shalin)
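
For readers unfamiliar with the kind of expression named in SOLR-1040 above, the following sketch evaluates the same xpath with the JDK's own javax.xml.xpath package. It does not use DIH's XPathEntityProcessor, and the class name XPathAttributeSketch and the sample feed in the usage note are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper; uses the JDK XPath engine, not DIH's XPathRecordReader. */
final class XPathAttributeSketch {
    // The expression quoted in the SOLR-1040 entry: select the href attribute
    // of the <link> element whose type attribute equals text/html.
    private static final String EXPRESSION = "/feed/entry/link[@type='text/html']/@href";

    static String htmlLinkHref(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Returns the string value of the first matching node, or "" if none match.
            return xpath.evaluate(EXPRESSION, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + EXPRESSION, e);
        }
    }
}
```

Given a feed whose entry contains one link of type application/atom+xml and one of type text/html, htmlLinkHref returns the href of the text/html link only.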
|
|
|
|
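
The rounding change described in SOLR-1242 above can be pictured with a small sketch. The class HumanReadableBytesSketch and both of its methods are hypothetical and are not the system handler's actual code; they only contrast integer cutoff with rounding to one decimal place.

```java
import java.util.Locale;

/** Hypothetical sketch contrasting integer cutoff with one-decimal rounding. */
final class HumanReadableBytesSketch {
    private static final String[] UNITS = {"bytes", "KB", "MB", "GB"};

    /** Integer division discards the fraction at every step. */
    static String truncated(long bytes) {
        long value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        return value + " " + UNITS[unit];
    }

    /** Floating-point division keeps the fraction, then rounds to one decimal place. */
    static String oneDecimal(long bytes) {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.ROOT keeps the decimal separator a period on every machine.
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }
}
```

With 1,610,612,736 bytes (exactly 1.5 * 1024^3), truncated reports "1 GB" while oneDecimal reports "1.5 GB", which is why cutoff is most misleading at the GB scale.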
|
|
Other Changes
|
|
----------------------
|
|
1. Upgraded to Lucene 2.4.0 (yonik)
|
|
|
|
2. SOLR-805: Upgraded to Lucene 2.9-dev (r707499) (koji)
|
|
|
|
3. DumpRequestHandler (/debug/dump): changed 'fieldName' to 'sourceInfo'. (ehatcher)
|
|
|
|
4. SOLR-852: Refactored common code in CSVRequestHandler and XMLUpdateRequestHandler (gsingers, ehatcher)
|
|
|
|
5. SOLR-871: Removed dependency on stax-utils.jar. If you are using solr.jar and running
|
|
java 6, you can also remove woodstox and geronimo. (ryan)
|
|
|
|
6. SOLR-465: Upgraded to Lucene 2.9-dev (r719351) (shalin)
|
|
|
|
7. SOLR-889: Upgraded to commons-io-1.4.jar and commons-fileupload-1.2.1.jar (ryan)
|
|
|
|
8. SOLR-875: Upgraded to Lucene 2.9-dev (r723985) and consolidated the BitSet implementations (Michael Busch, gsingers)
|
|
|
|
9. SOLR-819: Upgraded to Lucene 2.9-dev (r724059) to get access to Arabic public constructors (gsingers)
|
|
|
|
10. SOLR-900: Moved solrj into /src/solrj. The contents of solr-common.jar is now included
|
|
in the solr-solrj.jar. (ryan)
|
|
|
|
11. SOLR-924: Code cleanup: make all existing finalize() methods call
|
|
super.finalize() in a finally block. All current instances extend
|
|
Object, so this doesn't fix any bugs, but helps protect against
|
|
future changes. (Kay Kay via hossman)
|
|
|
|
12. SOLR-885: NamedListCodec is renamed to JavaBinCodec and returns Object instead of NamedList.
|
|
(Noble Paul, yonik via shalin)
|
|
|
|
13. SOLR-84: Use new Solr logo in admin (Michiel via koji)
|
|
|
|
14. SOLR-981: groupId for Woodstox dependency in maven solrj changed to org.codehaus.woodstox (Tim Taranov via shalin)
|
|
|
|
15. Upgraded to Lucene 2.9-dev r738218 (yonik)
|
|
|
|
16. SOLR-959: Refactored TestReplicationHandler to remove hardcoded port numbers (hossman, Akshay Ukey via shalin)
|
|
|
|
17. Upgraded to Lucene 2.9-dev r742220 (yonik)
|
|
|
|
18. SOLR-1022: Better "ignored" field in example schema.xml (Peter Wolanin via hossman)
|
|
|
|
19. SOLR-967: New type-safe constructor for NamedList (Kay Kay via hossman)
|
|
|
|
20. SOLR-1036: Change default QParser from "lucenePlusSort" to "lucene" to
|
|
reduce confusion of semicolon splitting behavior when no sort param is
|
|
specified (hossman)
|
|
|
|
21. Upgraded to Lucene 2.9-dev r752164 (shalin)
|
|
|
|
22. SOLR-1068: Use fsync on replicated index and configuration files (yonik, Noble Paul, shalin)
|
|
|
|
23. SOLR-952: Cleanup duplicated code in deprecated HighlightingUtils (hossman)
|
|
|
|
24. Upgraded to Lucene 2.9-dev r764281 (shalin)
|
|
|
|
25. SOLR-1079: Rename omitTf to omitTermFreqAndPositions (shalin)
|
|
|
|
26. SOLR-804: Added Lucene's misc contrib JAR (rev 764281). (gsingers)
|
|
|
|
27. Upgraded to Lucene 2.9-dev r768228 (shalin)
|
|
|
|
28. Upgraded to Lucene 2.9-dev r768336 (shalin)
|
|
|
|
29. SOLR-997: Wait for a longer time for slave to complete replication in TestReplicationHandler
|
|
(Mark Miller via shalin)
|
|
|
|
30. SOLR-748: FacetComponent helper classes are made public as an experimental API.
|
|
(Wojtek Piaseczny via shalin)
|
|
|
|
31. Upgraded to Lucene 2.9-dev r773862 (Mark Miller)
|
|
|
|
32. Upgraded to Lucene 2.9-dev r776177 (shalin)
|
|
|
|
33. SOLR-1149: Made QParserPlugin and related classes extendible as an experimental API.
|
|
(Kaktu Chakarabati via shalin)
|
|
|
|
34. Upgraded to Lucene 2.9-dev r779312 (yonik)
|
|
|
|
35. SOLR-786: Refactor DisMaxQParser to allow overriding certain features of DisMaxQParser
|
|
(Wojciech Biela via shalin)
|
|
|
|
36. SOLR-458: Add equals and hashCode methods to NamedList (Stefan Rinner, shalin)
|
|
|
|
37. SOLR-1184: Add option in solrconfig to open a new IndexReader rather than
|
|
using reopen. Done mainly as a fail-safe in the case that a user runs into
|
|
a reopen bug/issue. (Mark Miller)
|
|
|
|
38. SOLR-1215: Use double quotes to enclose attributes in solr.xml (noble)
|
|
|
|
39. SOLR-1151: add dynamic copy field and maxChars example to example schema.xml.
|
|
(Peter Wolanin, Mark Miller)
|
|
|
|
40. SOLR-1233: remove /select?qt=/whatever restriction on /-prefixed request handlers.
|
|
(ehatcher)
|
|
|
|
41. SOLR-1257: logging.jsp has been removed and now passes through to the
|
|
hierarchical log level tool added in Solr 1.3. Users still
|
|
hitting "/admin/logging.jsp" should switch to "/admin/logging".
|
|
(hossman)
|
|
|
|
42. Upgraded to Lucene 2.9-dev r794238. Other changes include:
|
|
- LUCENE-1614 - Use Lucene's DocIdSetIterator.NO_MORE_DOCS as the sentinel value.
|
|
- LUCENE-1630 - Add acceptsDocsOutOfOrder method to Collector implementations.
|
|
- LUCENE-1673, LUCENE-1701 - Trie has moved to Lucene core and renamed to NumericRangeQuery.
|
|
- LUCENE-1662, LUCENE-1687 - Replace usage of ExtendedFieldCache by FieldCache.
|
|
(shalin)
|
|
|
|
42. SOLR-1241: Solr's CharFilter has been moved to Lucene. Remove CharFilter and related classes
|
|
from Solr and use Lucene's corresponding code (koji via shalin)
|
|
|
|
43. SOLR-1261: Lucene trunk renamed RangeQuery & Co to TermRangeQuery (Uwe Schindler via shalin)
|
|
|
|
44. Upgraded to Lucene 2.9-dev r801856 (Mark Miller)
|
|
|
|
45. SOLR-1276: Added StatsComponentTest (Rafał Kuć, gsingers)
|
|
|
|
46. SOLR-1377: The TokenizerFactory API has changed to explicitly return a Tokenizer
|
|
rather than a TokenStream (that may be or may not be a Tokenizer). This change
|
|
is required to take advantage of the Token reuse improvements in lucene 2.9. (ryan)
|
|
|
|
47. SOLR-1410: Log a warning if the deprecated charset option is used
|
|
on GreekLowerCaseFilterFactory, RussianStemFilterFactory,
|
|
RussianLowerCaseFilterFactory or RussianLetterTokenizerFactory.
|
|
(Robert Muir via hossman)
|
|
|
|
48. SOLR-1423: Due to LUCENE-1906, Solr's tokenizer should use Tokenizer.correctOffset() instead of CharStream.correctOffset().
|
|
(Uwe Schindler via koji)
|
|
|
|
49. SOLR-1319, SOLR-1345: Upgrade Solr Highlighter classes to new Lucene Highlighter API. This upgrade has
|
|
resulted in a back compat break in the DefaultSolrHighlighter class - getQueryScorer is no longer
|
|
protected. If you happened to be overriding that method in custom code, override getHighlighter instead.
|
|
Also, HighlightingUtils#getQueryScorer has been removed as it was deprecated and backcompat has been
|
|
broken with it anyway. (Mark Miller)
|
|
|
|
50. SOLR-1357: SolrInputDocument cannot process dynamic fields (Lars Grote via noble)
|
|
|
|
51. SOLR-1075: Upgrade to Tika 0.3. See http://www.apache.org/dist/lucene/tika/CHANGES-0.3.txt (gsingers)
|
|
|
|
52. SOLR-1310: Upgrade to Tika 0.4. Note there are some differences in
|
|
detecting Languages now in extracting request handler.
|
|
See http://www.lucidimagination.com/search/document/d6f1899a85b2a45c/vote_apache_tika_0_4_release_candidate_2#d6f1899a85b2a45c
|
|
for discussion on language detection.
|
|
See http://www.apache.org/dist/lucene/tika/CHANGES-0.4.txt. (gsingers)
|
|
|
|
53. SOLR-782: DIH: Refactored SolrWriter to make it a concrete class and
|
|
removed wrappers over SolrInputDocument. Refactored to load Evaluators
|
|
lazily. Removed multiple document nodes in the configuration xml. Removed
|
|
support for 'default' variables, they are automatically available as
|
|
request parameters. (Noble Paul via shalin)
|
|
|
|
54. SOLR-964: DIH: XPathEntityProcessor now ignores DTD validations
|
|
(Fergus McMenemie, Noble Paul via shalin)
|
|
|
|
55. SOLR-1029: DIH: Standardize Evaluator parameter parsing and added helper
|
|
functions for parsing all evaluator parameters in a standard way.
|
|
(Noble Paul, shalin)
|
|
|
|
56. SOLR-1081: Change DIH EventListener to be an interface so that components
|
|
such as an EntityProcessor or a Transformer can act as an event listener.
|
|
(Noble Paul, shalin)
|
|
|
|
57. SOLR-1027: DIH: Alias the 'dataimporter' namespace to a shorter name 'dih'.
|
|
(Noble Paul via shalin)
|
|
|
|
58. SOLR-1084: Better error reporting when DIH entity name is a reserved word
|
|
and data-config.xml root node is not <dataConfig>.
|
|
(Noble Paul via shalin)
|
|
|
|
59. SOLR-1087: Deprecate 'where' attribute in CachedSqlEntityProcessor in
|
|
favor of cacheKey and cacheLookup. (Noble Paul via shalin)
|
|
|
|
60. SOLR-969: Change the FULL_DUMP, DELTA_DUMP, FIND_DELTA constants in DIH
|
|
Context to String. Change Context.currentProcess() to return a string
|
|
instead of an integer. (Kay Kay, Noble Paul, shalin)
|
|
|
|
61. SOLR-1120: Simplified DIH EntityProcessor API by moving logic for applying
|
|
transformers and handling multi-row outputs from Transformers into an
|
|
EntityProcessorWrapper class. The behavior of the method
|
|
EntityProcessor#destroy has been modified to be called once per parent-row
|
|
at the end of row. A new method EntityProcessor#close is added which is
|
|
called at the end of import. A new method
|
|
Context#getResolvedEntityAttribute is added which returns the resolved
|
|
value of an entity's attribute. Introduced a DocWrapper which takes care
|
|
of maintaining document level session variables.
|
|
(Noble Paul, shalin)
|
|
|
|
62. SOLR-1265: Add DIH variable resolving for URLDataSource properties like
|
|
baseUrl. (Chris Eldredge via ehatcher)
|
|
|
|
63. SOLR-1269: Better error messages from DIH JdbcDataSource when JDBC Driver
|
|
name or SQL is incorrect. (ehatcher, shalin)
|
|
|
|
|
|
Build
|
|
----------------------
|
|
1. SOLR-776: Added in ability to sign artifacts via Ant for releases (gsingers)
|
|
|
|
2. SOLR-854: Added run-example target (Mark Miller via ehatcher)
|
|
|
|
3. SOLR-1054: Fix dist-src target for DataImportHandler (Ryuuichi Kumai via shalin)
|
|
|
|
4. SOLR-1219: Added proxy.setup target (koji)
|
|
|
|
5. SOLR-1386: In build.xml, use longfile="gnu" in tar task to avoid warnings about long file names
|
|
(Mark Miller via shalin)
|
|
|
|
6. SOLR-1441: Make it possible to run all tests in a package (shalin)
|
|
|
|
|
|
Documentation
|
|
----------------------
|
|
1. SOLR-789: The javadoc of RandomSortField is not readable (Nicolas Lalevée via koji)
|
|
|
|
2. SOLR-962: Note about null handling in ModifiableSolrParams.add javadoc
|
|
(Kay Kay via hossman)
|
|
|
|
3. SOLR-1409: Added Solr Powered By Logos
|
|
|
|
4. SOLR-1369: Add HSQLDB Jar to example-DIH, unzip database and update
|
|
instructions.
|
|
|
|
|
|
================== Release 1.3.0 ==================
|
|
|
|
Upgrading from Solr 1.2
|
|
-----------------------
|
|
IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
|
|
should be upgraded before the master! If the master were to be updated
|
|
first, the older searchers would not be able to read the new index format.
|
|
|
|
The Porter snowball based stemmers in Lucene were updated (LUCENE-1142),
|
|
and are not guaranteed to be backward compatible at the index level
|
|
(the stem of certain words may have changed). Re-indexing is recommended.
|
|
|
|
Older Apache Solr installations can be upgraded by replacing
|
|
the relevant war file with the new version. No changes to configuration
|
|
files should be needed.
|
|
|
|
This version of Solr contains a new version of Lucene implementing
|
|
an updated index format. This version of Solr/Lucene can still read
|
|
and update indexes in the older formats, and will convert them to the new
|
|
format on the first index change. Be sure to back up your index before
|
|
upgrading in case you need to downgrade.
|
|
|
|
Solr now recognizes HTTP Request headers related to HTTP Caching (see
|
|
RFC 2616 sec13) and will by default respond with "304 Not Modified"
|
|
when appropriate. This should only affect users who access Solr via
|
|
an HTTP Cache, or via a Web-browser that has an internal cache, but if
|
|
you wish to suppress this behavior an '<httpCaching never304="true"/>'
|
|
option can be added to your solrconfig.xml. See the wiki (or the
|
|
example solrconfig.xml) for more details...
|
|
http://wiki.apache.org/solr/SolrConfigXml#HTTPCaching
|
|
|
|
In Solr 1.2, DateField did not enforce the canonical representation of
|
|
the ISO 8601 format when parsing incoming data, and did not generate
|
|
the canonical format when generating dates from "Date Math" strings
|
|
(particularly as it pertains to milliseconds ending in trailing zeros).
|
|
As a result equivalent dates could not always be compared properly.
|
|
This problem is corrected in Solr 1.3, but DateField users that might
|
|
have been affected by indexing inconsistent formats of equivalent
|
|
dates (ie: 1995-12-31T23:59:59Z vs 1995-12-31T23:59:59.000Z) may want
|
|
to consider reindexing to correct these inconsistencies. Users who
|
|
depend on some of the "broken" behavior of DateField in Solr 1.2
|
|
(specifically: accepting any input that ends in a 'Z') should consider
|
|
using the LegacyDateField class as a possible alternative. Users that
|
|
desire 100% backwards compatibility should consider using the Solr 1.2
|
|
version of DateField.
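
To make the DateField note above concrete, the sketch below uses the JDK's java.time.Instant (not Solr's DateField) to show that the two strings from the example denote the same instant even though they differ as text, which is how equivalent dates can fail a plain string comparison. The class name DateFormSketch is hypothetical, and the normal form printed by Instant is the JDK's own; it is not claimed to match Solr's canonical representation in every case.

```java
import java.time.Instant;

/** Hypothetical sketch using java.time; it does not involve Solr's DateField. */
final class DateFormSketch {
    /** True when both ISO 8601 strings denote the same point in time. */
    static boolean sameInstant(String a, String b) {
        return Instant.parse(a).equals(Instant.parse(b));
    }

    /** Re-prints a date in the JDK's normal form, which omits a zero fraction. */
    static String normalize(String date) {
        return Instant.parse(date).toString();
    }
}
```

Here sameInstant("1995-12-31T23:59:59Z", "1995-12-31T23:59:59.000Z") is true although the two strings are not equal, and normalize maps both to "1995-12-31T23:59:59Z". Comparing one agreed-upon form rather than raw input is the reason reindexing is suggested for affected data.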
|
|
|
|
Due to some changes in the lifecycle of TokenFilterFactories, users of
|
|
Solr 1.2 who have written Java code which constructs new instances of
|
|
StopFilterFactory, SynonymFilterFactory, or EnglishPorterFilterFactory
|
|
will need to modify their code by adding a line like the following
|
|
prior to using the factory object...
|
|
factory.inform(SolrCore.getSolrCore().getSolrConfig().getResourceLoader());
|
|
These lifecycle changes do not affect people who use Solr "out of the
|
|
box" or who have developed their own TokenFilterFactory plugins. More
|
|
info can be found in SOLR-594.
|
|
|
|
The python client that used to ship with Solr is no longer included in
|
|
the distribution (see client/python/README.txt).
|
|
|
|
Detailed Change List
|
|
--------------------
|
|
|
|
New Features
|
|
1. SOLR-69: Adding MoreLikeThisHandler to search for similar documents using
|
|
lucene contrib/queries MoreLikeThis. MoreLikeThis is also available from
|
|
the StandardRequestHandler using ?mlt=true. (bdelacretaz, ryan)
|
|
|
|
2. SOLR-253: Adding KeepWordFilter and KeepWordFilterFactory. A TokenFilter
|
|
that keeps tokens with text in the registered keeplist. This behaves like
|
|
the inverse of StopFilter. (ryan)
|
|
|
|
3. SOLR-257: WordDelimiterFilter has a new parameter splitOnCaseChange,
|
|
which can be set to 0 to disable splitting "PowerShot" => "Power" "Shot".
|
|
(klaas)
|
|
|
|
4. SOLR-193: Adding SolrDocument and SolrInputDocument to represent documents
|
|
outside of the lucene Document infrastructure. This class will be used
|
|
by clients and for processing documents. (ryan)
|
|
|
|
5. SOLR-244: Added ModifiableSolrParams - a SolrParams implementation that
|
|
helps you change values after initialization. (ryan)
|
|
|
|
6. SOLR-20: Added a java client interface with two implementations. One
|
|
implementation uses commons httpclient to connect to solr via HTTP. The
|
|
other connects to solr directly. Check client/java/solrj. This addition
|
|
also includes tests that start jetty and test a connection using the full
|
|
HTTP request cycle. (Darren Erik Vengroff, Will Johnson, ryan)
|
|
|
|
7. SOLR-133: Added StaxUpdateRequestHandler that uses StAX for XML parsing.
|
|
This implementation has much better error checking and lets you configure
|
|
a custom UpdateRequestProcessor that can selectively process update
|
|
requests depending on the request attributes. This class will likely
|
|
replace XmlUpdateRequestHandler. (Thorsten Scherler, ryan)
|
|
|
|
8. SOLR-264: Added RandomSortField, a utility field with a random sort order.
|
|
The seed is based on a hash of the field name, so a dynamic field
|
|
of this type is useful for generating different random sequences.
|
|
This field type should only be used for sorting or as a value source
|
|
in a FunctionQuery (ryan, hossman, yonik)
|
|
|
|
9. SOLR-266: Adding show=schema to LukeRequestHandler to show the parsed
|
|
schema fields and field types. (ryan)
|
|
|
|
10. SOLR-133: The UpdateRequestHandler now accepts multiple delete options
|
|
within a single request. For example, sending:
|
|
<delete><id>1</id><id>2</id></delete> will delete both 1 and 2. (ryan)
|
|
|
|
11. SOLR-269: Added UpdateRequestProcessor plugin framework. This provides
|
|
a reasonable place to process documents after they are parsed and
|
|
before they are committed to the index. This is a good place for custom
|
|
document manipulation or document based authorization. (yonik, ryan)
|
|
|
|
12. SOLR-260: Converting to a standard PluginLoader framework. This reworks
|
|
RequestHandlers, FieldTypes, and QueryResponseWriters to share the same
|
|
base code for loading and initializing plugins. This adds a new
|
|
configuration option to define the default RequestHandler and
|
|
QueryResponseWriter in XML using default="true". (ryan)
|
|
|
|
13. SOLR-225: Enable pluggable highlighting classes. Allow configurable
|
|
highlighting formatters and Fragmenters. (ryan)
|
|
|
|
14. SOLR-273/376/452/516: Added hl.maxAnalyzedChars highlighting parameter, defaulting
|
|
to 50k, hl.alternateField, which allows the specification of a backup
|
|
field to use as summary if no keywords are matched, and hl.mergeContiguous,
|
|
which combines fragments if they are adjacent in the source document.
|
|
(klaas, Grant Ingersoll, Koji Sekiguchi via klaas)
|
|
|
|
15. SOLR-291: Control maximum number of documents to cache for any entry
|
|
in the queryResultCache via queryResultMaxDocsCached solrconfig.xml
|
|
entry. (Koji Sekiguchi via yonik)
|
|
|
|
16. SOLR-240: New <lockType> configuration setting in <mainIndex> and
|
|
<indexDefaults> blocks supports all Lucene builtin LockFactories.
|
|
'single' is the recommended setting, but 'simple' is the default for total
|
|
backwards compatibility.
|
|
(Will Johnson via hossman)
|
|
|
|
17. SOLR-248: Added CapitalizationFilterFactory that creates tokens with
|
|
normalized capitalization. This filter is useful for facet display,
|
|
but will not work with a prefix query. (ryan)
|
|
SOLR-468: Change to the semantics to keep the original token, not the
|
|
token in the Map. Also switched to use Lucene's new reusable token
|
|
capabilities. (gsingers)
|
|
|
|
18. SOLR-307: Added NGramFilterFactory and EdgeNGramFilterFactory.
|
|
(Thomas Peuss via Otis Gospodnetic)
|
|
|
|
19. SOLR-305: analysis.jsp can be given a fieldtype instead of a field
|
|
name. (hossman)
|
|
|
|
20. SOLR-102: Added RegexFragmenter, which splits text for highlighting
|
|
based on a given pattern. (klaas)
|
|
|
|
21. SOLR-258: Date Faceting added to SimpleFacets. Facet counts
|
|
computed for ranges of size facet.date.gap (a DateMath expression)
|
|
between facet.date.start and facet.date.end. (hossman)
|
|
|
|
22. SOLR-196: A PHP serialized "phps" response writer that returns a
|
|
serialized array that can be used with the PHP function unserialize,
|
|
and a PHP response writer "php" that may be used by eval.
|
|
(Nick Jenkin, Paul Borgermans, Pieter Berkel via yonik)
|
|
|
|
23. SOLR-308: A new UUIDField class which accepts UUID string values,
|
|
as well as the special value of "NEW" which triggers generation of
|
|
a new random UUID.
|
|
(Thomas Peuss via hossman)
|
|
|
|
24. SOLR-349: New FunctionQuery functions: sum, product, div, pow, log,
|
|
sqrt, abs, scale, map. Constants may now be used as a value source.
|
|
(yonik)
|
|
|
|
25. SOLR-359: Add field type className to Luke response, and enabled access
|
|
to the detailed field information from the solrj client API.
|
|
(Grant Ingersoll via ehatcher)
|
|
|
|
26. SOLR-334: Pluggable query parsers. Allows specification of query
|
|
type and arguments as a prefix on a query string. (yonik)
|
|
|
|
27. SOLR-351: External Value Source. An external file may be used
|
|
to specify the values of a field, currently usable as
|
|
a ValueSource in a FunctionQuery. (yonik)
|
|
|
|
28. SOLR-395: Many new features for the spell checker implementation, including
|
|
an extended response mode with much richer output, multi-word spell checking,
|
|
and a bevy of new and renamed options (see the wiki).
|
|
(Mike Krimerman, Scott Taber via klaas).
|
|
|
|
29. SOLR-408: Added PingRequestHandler and deprecated SolrCore.getPingQueryRequest().
|
|
Ping requests should be configured using standard RequestHandler syntax in
|
|
solrconfig.xml rather than using the <pingQuery></pingQuery> syntax.
|
|
(Karsten Sperling via ryan)
|
|
|
|
30. SOLR-281: Added a 'Search Component' interface and converted StandardRequestHandler
|
|
and DisMaxRequestHandler to use this framework.
|
|
(Sharad Agarwal, Henri Biestro, yonik, ryan)
|
|
|
|
31. SOLR-176: Add detailed timing data to query response output. The SearchHandler
|
|
interface now returns how long each section takes. (klaas)
|
|
|
|
32. SOLR-414: Plugin initialization now supports SolrCore and ResourceLoader "Aware"
|
|
plugins. Plugins that implement SolrCoreAware or ResourceLoaderAware are
|
|
informed about the SolrCore/ResourceLoader. (Henri Biestro, ryan)
|
|
|
|
33. SOLR-350: Support multiple SolrCores running in the same solr instance and allow
|
|
runtime management for any running SolrCore. If a solr.xml file exists
|
|
in solr.home, this file is used to instantiate multiple cores and enables runtime
|
|
core manipulation. For more information see: http://wiki.apache.org/solr/CoreAdmin
|
|
(Henri Biestro, ryan)
|
|
|
|
34. SOLR-447: Added a single request handler that will automatically register all
|
|
standard admin request handlers. This replaces the need to register (and maintain)
|
|
the set of admin request handlers. Assuming solrconfig.xml includes:
|
|
<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
|
|
This will register: Luke/SystemInfo/PluginInfo/ThreadDump/PropertiesRequestHandler.
|
|
(ryan)
|
|
|
|
35. SOLR-142: Added RawResponseWriter and ShowFileRequestHandler. This returns config
|
|
files directly. If AdminHandlers are configured, this will be added automatically.
|
|
The jsp files /admin/get-file.jsp and /admin/raw-schema.jsp have been deprecated.
|
|
The deprecated <admin><gettableFiles> will be automatically registered with
|
|
a ShowFileRequestHandler instance for backwards compatibility. (ryan)
|
|
|
|
36. SOLR-446: TextResponseWriter can write SolrDocuments and SolrDocumentLists the
|
|
same way it writes Document and DocList. (yonik, ryan)
|
|
|
|
37. SOLR-418: Adding a query elevation component. This is an optional component to
|
|
elevate some documents to the top positions (or exclude them) for a given query.
|
|
(ryan)
|
|
|
|
38. SOLR-478: Added ability to get back unique key information from the LukeRequestHandler.
|
|
(gsingers)
|
|
|
|
39. SOLR-127: HTTP Caching awareness. Solr now recognizes HTTP Request
|
|
headers related to HTTP Caching (see RFC 2616 sec13) and will respond
|
|
with "304 Not Modified" when appropriate. New options have been added
|
|
to solrconfig.xml to influence this behavior.
|
|
(Thomas Peuss via hossman)
|
|
|
|
40. SOLR-303: Distributed Search over HTTP. Specification of shards
|
|
argument causes Solr to query those shards and merge the results
|
|
into a single response. Querying, field faceting (sorted only),
|
|
query faceting, highlighting, and debug information are supported
|
|
in distributed mode.
|
|
(Sharad Agarwal, Patrick O'Leary, Sabyasachi Dalal, Stu Hood,
|
|
Jayson Minard, Lars Kotthoff, ryan, yonik)
|
|
|
|
41. SOLR-356: Pluggable functions (value sources) that allow
|
|
registration of new functions via solrconfig.xml
|
|
(Doug Daniels via yonik)
|
|
|
|
42. SOLR-494: Added cool admin Ajaxed schema explorer.
|
|
(Greg Ludington via ehatcher)
|
|
|
|
43. SOLR-497: Added date faceting to the QueryResponse in SolrJ
|
|
and QueryResponseTest (Shalin Shekhar Mangar via gsingers)
|
|
|
|
44. SOLR-486: Binary response format, faster and smaller
|
|
than XML and JSON response formats (use wt=javabin).
|
|
BinaryResponseParser utilizes the binary format via SolrJ
|
|
and is now the default.
|
|
(Noble Paul, yonik)
|
|
|
|
45. SOLR-521: StopFilterFactory support for "enablePositionIncrements"
|
|
(Walter Ferrara via hossman)
|
|
|
|
46. SOLR-557: Added SolrCore.getSearchComponents() to return an unmodifiable Map. (gsingers)
|
|
|
|
47. SOLR-516: Added hl.maxAlternateFieldLength parameter, to set max length for hl.alternateField
|
|
(Koji Sekiguchi via klaas)
|
|
|
|
48. SOLR-319: Changed SynonymFilterFactory to "tokenize" synonyms file.
|
|
To use a tokenizer, specify "tokenizerFactory" attribute in <filter>.
|
|
For example:
|
|
<tokenizer class="solr.CJKTokenizerFactory"/>
|
|
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" expand="true"
|
|
ignoreCase="true" tokenizerFactory="solr.CJKTokenizerFactory"/>
|
|
(koji)
|
|
|
|
49. SOLR-515: Added SimilarityFactory capability to schema.xml,
|
|
making config file parameters usable in the construction of
|
|
the global Lucene Similarity implementation.
|
|
(ehatcher)
|
|
|
|
50. SOLR-536: Add a DocumentObjectBinder to solrj that converts Objects to and
|
|
from SolrDocuments. (Noble Paul via ryan)
|
|
|
|
51. SOLR-595: Add support for Field level boosting in the MoreLikeThis Handler.
|
|
(Tom Morton, gsingers)
|
|
|
|
52. SOLR-572: Added SpellCheckComponent and org.apache.solr.spelling package to support more spell
|
|
checking functionality. Also includes ability to add your own SolrSpellChecker implementation that
|
|
plugs in. See http://wiki.apache.org/solr/SpellCheckComponent for more details
|
|
(Shalin Shekhar Mangar, Bojan Smid, gsingers)
|
|
|
|
53. SOLR-679: Added accessor methods to Lucene based spell checkers (gsingers)
|
|
|
|
54. SOLR-423: Added Request Handler close hook notification so that RequestHandlers can be notified
|
|
when a core is closing. (gsingers, ryan)
|
|
|
|
55. SOLR-603: Added ability to partially optimize. (gsingers)
|
|
|
|
56. SOLR-483: Add byte/short sorting support (gsingers)
|
|
|
|
57. SOLR-14: Add preserveOriginal flag to WordDelimiterFilter
|
|
(Geoffrey Young, Trey Hyde, Ankur Madnani, yonik)
|
|
|
|
58. SOLR-502: Add search timeout support. (Sean Timm via yonik)
|
|
|
|
59. SOLR-605: Add the ability to register callbacks programmatically (ryan, Noble Paul)
|
|
|
|
60. SOLR-610: hl.maxAnalyzedChars can be -1 to highlight everything (Lars Kotthoff via klaas)
|
|
|
|
61. SOLR-522: Make analysis.jsp show payloads. (Tricia Williams via yonik)
|
|
|
|
62. SOLR-611: Expose sort_values returned by QueryComponent in SolrJ's QueryResponse
|
|
(Dan Rosher via shalin)
|
|
|
|
63. SOLR-256: Support exposing Solr statistics through JMX (Sharad Agarwal, shalin)
|
|
|
|
64. SOLR-666: Expose warmup time in statistics for SolrIndexSearcher and LRUCache (shalin)
|
|
|
|
65. SOLR-663: Allow multiple files for stopwords, keepwords, protwords and synonyms
|
|
(Otis Gospodnetic, shalin)
|
|
|
|
66. SOLR-469: Added DataImportHandler as a contrib project which makes indexing data from Databases,
|
|
XML files and HTTP data sources into Solr quick and easy. Includes API and implementations for
|
|
supporting multiple data sources, processors and transformers for importing data. Supports full
|
|
data imports as well as incremental (delta) indexing. See http://wiki.apache.org/solr/DataImportHandler
|
|
for more details. (Noble Paul, shalin)
|
|
|
|
67. SOLR-622: SpellCheckComponent supports auto-loading indices on startup and optionally, (re)builds
|
|
indices on newSearcher event, if configured in solrconfig.xml (shalin)
|
|
|
|
68. SOLR-554: Hierarchical JDK log level selector for SOLR Admin replaces logging.jsp
|
|
(Sean Timm via shalin)
|
|
|
|
69. SOLR-506: Emitting HTTP Cache headers can be enabled or disabled through configuration on a
|
|
per-handler basis (shalin)
|
|
|
|
70. SOLR-716: Added support for properties in configuration files. Properties can be specified in
|
|
solr.xml and can be used in solrconfig.xml and schema.xml (Henri Biestro, hossman, ryan, shalin)
|
|
|
|
71. SOLR-1129: Support binding dynamic fields to beans in SolrJ (Avlesh Singh, noble)
|
|
|
|
72. SOLR-920: Cache and reuse IndexSchema. A new attribute called 'shareSchema' was added to solr.xml (noble)
|
|
|
|
73. SOLR-700: DIH: Allow configurable locales through a locale attribute in
|
|
fields for NumberFormatTransformer. (Stefan Oestreicher, shalin)
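
As an illustration of item 21 above (SOLR-258), the following is a minimal sketch of
how a client might assemble the date faceting parameters into a query string. The
parameter names facet.date, facet.date.start, facet.date.end and facet.date.gap come
from the entry itself; the field name "timestamp", the DateMath values and the class
and method names are assumptions made for this example only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ.
final class DateFacetUrl {

    /** Collects the date faceting parameters in a stable order. */
    static Map<String, String> dateFacetParams(String field, String start,
                                               String end, String gap) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("facet", "true");
        params.put("facet.date", field);
        params.put("facet.date.start", start);
        params.put("facet.date.end", end);
        params.put("facet.date.gap", gap);
        return params;
    }

    /** Joins the parameters into a query string, URL-encoding keys and values. */
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "timestamp" is an assumed field name; the DateMath values are examples.
        System.out.println(toQueryString(
            dateFacetParams("timestamp", "NOW/DAY-5DAYS", "NOW/DAY+1DAY", "+1DAY")));
    }
}
```

Encoding matters here because a DateMath gap such as +1DAY begins with a plus sign,
which a servlet container would otherwise decode as a space.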
|
|
|
|
Changes in runtime behavior
|
|
1. SOLR-559: use Lucene updateDocument, deleteDocuments methods. This
|
|
removes the maxBufferedDeletes parameter added by SOLR-310 as Lucene
|
|
now manages the deletes. This provides slightly better indexing
|
|
performance and makes overwrites atomic, eliminating the possibility of
|
|
a crash causing duplicates. (yonik)
|
|
|
|
2. SOLR-689 / SOLR-695: If you have used "MultiCore" functionality in an unreleased
|
|
version of 1.3-dev, many classes and configs have been renamed for the official
|
|
1.3 release. Specifically, solr.xml has replaced multicore.xml, and uses a slightly
|
|
different syntax. The solrj classes: MultiCore{Request/Response/Params} have been
|
|
renamed: CoreAdmin{Request/Response/Params} (hossman, ryan, Henri Biestro)
|
|
|
|
3. SOLR-647: reference count the SolrCore uses to prevent a premature
|
|
close while a core is still in use. (Henri Biestro, Noble Paul, yonik)
|
|
|
|
4. SOLR-737: SolrQueryParser now uses a ConstantScoreQuery for wildcard
|
|
queries to prevent an exception from being thrown when the number
|
|
of matching terms exceeds the BooleanQuery clause limit. (yonik)
|
|
|
|
Optimizations
|
|
1. SOLR-276: improve JSON writer speed. (yonik)
|
|
|
|
2. SOLR-310: bound and reduce memory usage by providing <maxBufferedDeletes> parameter,
|
|
which flushes deletes without forcing the user to use <commit/> for this purpose.
|
|
(klaas)
|
|
|
|
3. SOLR-348: short-circuit faceting if less than mincount docs match. (yonik)
|
|
|
|
4. SOLR-354: Optimize removing all documents. Now when a delete by query
|
|
of *:* is issued, the current index is removed. (yonik)
|
|
|
|
5. SOLR-377: Speed up response writers. (yonik)
|
|
|
|
6. SOLR-342: Added support into the SolrIndexWriter for using several new features of the new
|
|
Lucene IndexWriter, including: setRAMBufferSizeMB(), setMergePolicy(), setMergeScheduler().
|
|
Also, added support to specify Lucene's autoCommit functionality (not to be confused with Solr's
|
|
similarly named autoCommit functionality) via the <luceneAutoCommit> config. item. See the test
|
|
and example solrconfig.xml <indexDefaults> section for usage. Performance during indexing should
|
|
be significantly increased by moving up to 2.3 due to Lucene's new indexing capabilities.
|
|
Furthermore, the setRAMBufferSizeMB makes it more logical to decide on tuning factors related to
|
|
indexing. For best performance, leave the mergePolicy and mergeScheduler as the defaults and set
|
|
ramBufferSizeMB instead of maxBufferedDocs. The best value for this depends on the types of
|
|
documents in use. 32 MB should be a good starting point, but reports have shown up to 48 MB provides
|
|
good results. Note, it is acceptable to set both ramBufferSizeMB and maxBufferedDocs, and Lucene
|
|
will flush based on whichever limit is reached first. (gsingers)
|
|
|
|
7. SOLR-330: Converted TokenStreams to use Lucene's new char array based
|
|
capabilities. (gsingers)
|
|
|
|
8. SOLR-624: Only take snapshots if there are differences to the index (Richard Trey Hyde via gsingers)
|
|
|
|
9. SOLR-587: Delete by Query performance greatly improved by using
|
|
new underlying Lucene IndexWriter implementation. (yonik)
|
|
|
|
10. SOLR-730: Use read-only IndexReaders that don't synchronize
|
|
isDeleted(). This will speed up function queries and *:* queries
|
|
as well as improve their scalability on multi-CPU systems.
|
|
(Mark Miller via yonik)
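
Item 6 above (SOLR-342) notes that when both ramBufferSizeMB and maxBufferedDocs are
set, a flush happens on whichever limit is reached first. The sketch below is a toy
model of that rule only; it is not Lucene's IndexWriter code, and the class name, the
method names and the fixed per-document size are assumptions for illustration.

```java
// Toy model of "flush on whichever limit is reached first"; not Lucene code.
final class FlushPolicySketch {
    private final double ramBufferSizeMB;
    private final int maxBufferedDocs;

    FlushPolicySketch(double ramBufferSizeMB, int maxBufferedDocs) {
        this.ramBufferSizeMB = ramBufferSizeMB;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** True once either the document count or the RAM estimate hits its limit. */
    boolean shouldFlush(int bufferedDocs, double ramUsedMB) {
        return bufferedDocs >= maxBufferedDocs || ramUsedMB >= ramBufferSizeMB;
    }

    /** Counts how many documents of an assumed fixed size fit before a flush. */
    int docsUntilFlush(double avgDocSizeMB) {
        int docs = 0;
        while (!shouldFlush(docs, docs * avgDocSizeMB)) {
            docs++;
        }
        return docs;
    }

    public static void main(String[] args) {
        FlushPolicySketch policy = new FlushPolicySketch(32.0, 1000);
        System.out.println(policy.docsUntilFlush(0.5));   // large docs: RAM limit wins
        System.out.println(policy.docsUntilFlush(0.01));  // small docs: doc limit wins
    }
}
```

With large documents the RAM limit triggers the flush long before the document count
does, which is one way to read the advice to tune ramBufferSizeMB rather than
maxBufferedDocs.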
|
|
|
|
Bug Fixes
|
|
1. Make TextField respect sortMissingFirst and sortMissingLast fields.
|
|
(J.J. Larrea via yonik)
|
|
|
|
2. autoCommit/maxDocs was not working properly when large autoCommit/maxTime
|
|
was specified (klaas)
|
|
|
|
3. SOLR-283: autoCommit was not working after delete. (ryan)
|
|
|
|
4. SOLR-286: ContentStreamBase was not using default encoding for getBytes()
|
|
(Toru Matsuzawa via ryan)
|
|
|
|
5. SOLR-292: Fix MoreLikeThis facet counting. (Pieter Berkel via ryan)
|
|
|
|
6. SOLR-297: Fix bug in RequiredSolrParams where requiring a field
|
|
specific param would fail if a general default value had been supplied.
|
|
(hossman)
|
|
|
|
7. SOLR-331: Fix WordDelimiterFilter handling of offsets for synonyms or
|
|
other injected tokens that can break highlighting. (yonik)
|
|
|
|
8. SOLR-282: Snapshooter does not work on Solaris and OS X since the cp command
|
|
there does not have the -l option. Also updated commit/optimize related
|
|
scripts to handle both old and new response format. (bill)
|
|
|
|
9. SOLR-294: Logging of elapsed time broken on Solaris because the date command
|
|
there does not support the %s output format. (bill)
|
|
|
|
10. SOLR-136: Snappuller - "date -d" and locales don't mix. (Jürgen Hermann via bill)
|
|
|
|
11. SOLR-333: Changed distributiondump.jsp to use Solr HOME instead of CWD to set path.
|
|
|
|
12. SOLR-393: Removed duplicate contentType from raw-schema.jsp. (bill)
|
|
|
|
13. SOLR-413: Requesting a large number of documents to be returned (limit)
|
|
can result in an out-of-memory exception, even for a small index. (yonik)
|
|
|
|
14. The CSV loader incorrectly threw an exception when given
|
|
header=true (the default). (ryan, yonik)
|
|
|
|
15. SOLR-449: the python and ruby response writers are now able to correctly
|
|
output NaN and Infinity in their respective languages. (klaas)
|
|
|
|
16. SOLR-42: HTMLStripReader tokenizers now preserve correct source
|
|
offsets for highlighting. (Grant Ingersoll via yonik)
|
|
|
|
17. SOLR-481: Handle UnknownHostException in _info.jsp (gsingers)
|
|
|
|
18. SOLR-324: Add proper support for Long and Doubles in sorting, etc. (gsingers)
|
|
|
|
19. SOLR-496: Cache-Control max-age changed to Long so Expires
|
|
calculation won't cause overflow. (Thomas Peuss via hossman)
|
|
|
|
20. SOLR-535: Fixed typo (Tokenzied -> Tokenized) in schema.jsp (Thomas Peuss via billa)
|
|
|
|
21. SOLR-529: Better error messages from SolrQueryParser when field isn't
|
|
specified and there is no defaultSearchField in schema.xml
|
|
(Lars Kotthoff via hossman)
|
|
|
|
22. SOLR-530: Better error messages/warnings when parsing schema.xml:
|
|
field using bogus fieldtype and multiple copyFields to a non-multiValue
|
|
field. (Shalin Shekhar Mangar via hossman)
|
|
|
|
23. SOLR-528: Better error message when defaultSearchField is bogus or not
|
|
indexed. (Lars Kotthoff via hossman)
|
|
|
|
24. SOLR-533: Fixed tests so they don't use hardcoded port numbers.
|
|
(hossman)
|
|
|
|
25. SOLR-400: SolrExceptionTest should now handle using OpenDNS as a DNS provider (gsingers)
|
|
|
|
26. SOLR-541: Legacy XML update support (provided by SolrUpdateServlet
|
|
when no RequestHandler is mapped to "/update") now logs error correctly.
|
|
(hossman)
|
|
|
|
27. SOLR-267: Changed logging to report number of hits, and also provide a mechanism to add log
|
|
messages to be output by the SolrCore via a NamedList toLog member variable.
|
|
(Will Johnson, yseeley, gsingers)
|
|
|
|
- SOLR-267: Removed adding values to the HTTP headers in SolrDispatchFilter (gsingers)
|
|
|
|
28. SOLR-509: Moved firstSearcher event notification to the end of the SolrCore constructor
|
|
(Koji Sekiguchi via gsingers)
|
|
|
|
29. SOLR-470, SOLR-552, SOLR-544, SOLR-701: Multiple fixes to DateField
|
|
regarding lenient parsing of optional milliseconds, and correct
|
|
formatting using the canonical representation. LegacyDateField has
|
|
been added for people who have come to depend on the existing
|
|
broken behavior. (hossman, Stefan Oestreicher)
|
|
|
|
30. SOLR-539: Fix for non-atomic long counters and a cast fix to avoid divide
|
|
by zero. (Sean Timm via Otis Gospodnetic)
|
|
|
|
31. SOLR-514: Added explicit media-type with UTF* charset to *.xsl files that
|
|
don't already have one. (hossman)
|
|
|
|
32. SOLR-505: Give RequestHandlers the possibility to suppress the generation
|
|
of HTTP caching headers. (Thomas Peuss via Otis Gospodnetic)
|
|
|
|
33. SOLR-553: Handle highlighting of phrase terms better when
|
|
hl.usePhraseHighlighter=true URL param is used.
|
|
(Bojan Smid via Otis Gospodnetic)
|
|
|
|
34. SOLR-590: Limitation in pgrep on Linux platform breaks script-utils fixUser.
|
|
(Hannes Schmidt via billa)
|
|
|
|
35. SOLR-597: SolrServlet no longer "caches" SolrCore. This was causing
|
|
problems in Resin, and could potentially cause problems for customized
|
|
usages of SolrServlet.
|
|
|
|
36. SOLR-585: Now sets the QParser on the ResponseBuilder (gsingers)
|
|
|
|
37. SOLR-604: If the spellchecking path is relative, make it relative to the Solr Data Directory.
|
|
(Shalin Shekhar Mangar via gsingers)
|
|
|
|
38. SOLR-584: Make stats.jsp and stats.xsl more robust.
|
|
(Yousef Ourabi and hossman)
|
|
|
|
39. SOLR-443: SolrJ: Declare UTF-8 charset on POSTed parameters
|
|
to avoid problems with servlet containers that default to latin-1
|
|
and allow switching of the exact POST mechanism for parameters
|
|
via useMultiPartPost in CommonsHttpSolrServer.
|
|
(Lars Kotthoff, Andrew Schurman, ryan, yonik)
|
|
|
|
40. SOLR-556: multi-valued fields always highlighted in disparate snippets
|
|
(Lars Kotthoff via klaas)
|
|
|
|
41. SOLR-501: Fix admin/analysis.jsp UTF-8 input for some other servlet
|
|
containers such as Tomcat. (Hiroaki Kawai, Lars Kotthoff via yonik)
|
|
|
|
42. SOLR-616: SpellChecker accuracy configuration is not applied for FileBasedSpellChecker.
|
|
Apply it to both FileBasedSpellChecker and IndexBasedSpellChecker.
|
|
(shalin)
|
|
|
|
43. SOLR-648: SpellCheckComponent throws NullPointerException on using spellcheck.q request
|
|
parameter after restarting Solr, if reload is called but build is not called.
|
|
(Jonathan Lee, shalin)
|
|
|
|
44. SOLR-598: DebugComponent now always occurs last in the SearchHandler list unless the
|
|
components are explicitly declared. (gsingers)
|
|
|
|
45. SOLR-676: DataImportHandler should use UpdateRequestProcessor API instead of directly
|
|
using UpdateHandler. (shalin)
|
|
|
|
46. SOLR-696: Fixed bug in NamedListCodec in regards to serializing Iterable objects. (gsingers)
|
|
|
|
47. SOLR-669: snappuller fix for FreeBSD/Darwin (Richard "Trey" Hyde via Otis Gospodnetic)
|
|
|
|
48. SOLR-606: Fixed spell check collation offset issue. (Stefan Oestreicher, Geoffrey Young, gsingers)
|
|
|
|
49. SOLR-589: Improved handling of badly formatted query strings (Sean Timm via Otis Gospodnetic)
|
|
|
|
50. SOLR-749: Allow QParser and ValueSourceParsers to be extended with same name (hossman, gsingers)
|
|
|
|
51. SOLR-704: DIH NumberFormatTransformer can silently ignore part of the
|
|
string while parsing. Now it tries to use the complete string for parsing.
|
|
Failure to do so will result in an exception.
|
|
(Stefan Oestreicher via shalin)
|
|
|
|
52. SOLR-729: DIH Context.getDataSource(String) gives current entity's
|
|
DataSource instance regardless of argument. (Noble Paul, shalin)
|
|
|
|
53. SOLR-726: DIH: Jdbc Drivers and DataSources fail to load if placed in
|
|
multicore sharedLib or core's lib directory.
|
|
(Walter Ferrara, Noble Paul, shalin)
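
Item 51 above (SOLR-704) concerns number parsing that silently ignores trailing text.
The sketch below shows the general java.text technique for rejecting such input by
checking that the whole string was consumed. It is not the DataImportHandler source;
the class and method names are assumptions for this example.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper illustrating strict parsing; not the DIH implementation.
final class StrictNumberParsing {

    /** Parses text in the given locale and fails unless every character is used. */
    static Number parseStrict(String text, Locale locale) throws ParseException {
        NumberFormat format = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number result = format.parse(text, pos);
        if (result == null || pos.getIndex() != text.length()) {
            // Report where parsing stopped: the error index on outright failure,
            // otherwise the first unconsumed character.
            int errorAt = (result == null) ? pos.getErrorIndex() : pos.getIndex();
            throw new ParseException("Unparseable number: \"" + text + "\"",
                                     Math.max(errorAt, 0));
        }
        return result;
    }

    public static void main(String[] args) throws ParseException {
        // The lenient one-argument parse stops at the first unparseable character.
        System.out.println(NumberFormat.getInstance(Locale.US).parse("12abc"));
        System.out.println(parseStrict("1,234.5", Locale.US));
        System.out.println(parseStrict("1.234,5", Locale.GERMANY));
    }
}
```

The one-argument NumberFormat.parse(String) accepts a numeric prefix and discards the
rest, which is why the ParsePosition overload is used to detect leftover input.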
|
|
|
|
Other Changes
|
|
1. SOLR-135: Moved common classes to org.apache.solr.common and altered the
|
|
build scripts to make two jars: apache-solr-1.3.jar and
|
|
apache-solr-1.3-common.jar. This common.jar can be used in client code;
|
|
it does not have lucene or junit dependencies. The original classes
|
|
have been replaced with a @Deprecated extended class and are scheduled
|
|
to be removed in a later release. While this change does not affect API
|
|
compatibility, it is recommended to update references to these
|
|
deprecated classes. (ryan)
|
|
|
|
2. SOLR-268: Tweaks to post.jar so it prints the error message from Solr.
|
|
(Brian Whitman via hossman)
|
|
|
|
3. Upgraded to Lucene 2.2.0; June 18, 2007.
|
|
|
|
4. SOLR-215: Static access to SolrCore.getSolrCore() and SolrConfig.config
|
|
have been deprecated in order to support multiple loaded cores.
|
|
(Henri Biestro via ryan)
|
|
|
|
5. SOLR-367: The create method in all TokenFilter and Tokenizer Factories
|
|
provided by Solr now declares its specific return type instead of just
|
|
using "TokenStream" (hossman)
|
|
|
|
6. SOLR-396: Hooks added to build system for automatic generation of (stub)
|
|
Tokenizer and TokenFilter Factories.
|
|
Also: new Factories for all Tokenizers and TokenFilters provided by the
|
|
lucene-analyzers-2.2.0.jar -- includes support for German, Chinese,
|
|
Russian, Dutch, Greek, Brazilian, Thai, and French. (hossman)
|
|
|
|
7. Upgraded to commons-CSV r609327, which fixes escaping bugs and
|
|
introduces new escaping and whitespace handling options to
|
|
increase compatibility with different formats. (yonik)
|
|
|
|
8. Upgraded to Lucene 2.3.0; Jan 23, 2008.
|
|
|
|
9. SOLR-451: Changed analysis.jsp to use POST instead of GET, also made the input area a
|
|
bit bigger (gsingers)
|
|
|
|
10. Upgrade to Lucene 2.3.1
|
|
|
|
11. SOLR-531: Different exit code for rsyncd-start and snappuller if disabled (Thomas Peuss via billa)
|
|
|
|
12. SOLR-550: Clarified DocumentBuilder addField javadocs (gsingers)
|
|
|
|
13. Upgrade to Lucene 2.3.2
|
|
|
|
14. SOLR-518: Changed luke.xsl to use divs w/css for generating histograms
|
|
instead of SVG (Thomas Peuss via hossman)
|
|
|
|
15. SOLR-592: Added ShardParams interface and changed several string literals
|
|
to references to constants in CommonParams.
|
|
(Lars Kotthoff via Otis Gospodnetic)
|
|
|
|
16. SOLR-520: Deprecated unused LengthFilter since already core in
|
|
Lucene-Java (hossman)
|
|
|
|
17. SOLR-645: Refactored SimpleFacetsTest (Lars Kotthoff via hossman)
|
|
|
|
18. SOLR-591: Changed Solrj default value for facet.sort to true (Lars Kotthoff via Shalin)
|
|
|
|
19. Upgraded to Lucene 2.4-dev (r669476) to support SOLR-572 (gsingers)
|
|
|
|
20. SOLR-636: Improve/simplify example configs; and make index.jsp
|
|
links more resilient to configs loaded via an InputStream
|
|
(Lars Kotthoff, hossman)
|
|
|
|
21. SOLR-682: Scripts now support FreeBSD (Richard Trey Hyde via gsingers)
|
|
|
|
22. SOLR-489: Added in deprecation comments. (Sean Timm, Lars Kotthoff via gsingers)
|
|
|
|
23. SOLR-692: Migrated to stable released builds of StAX API 1.0.1 and StAX 1.2.0 (shalin)
|
|
24. Upgraded to Lucene 2.4-dev (r686801) (yonik)
|
|
25. Upgraded to Lucene 2.4-dev (r688745) 27-Aug-2008 (yonik)
|
|
26. Upgraded to Lucene 2.4-dev (r691741) 03-Sep-2008 (yonik)
|
|
27. Replaced the StAX reference implementation with the geronimo
|
|
StAX API jar, and the Woodstox StAX implementation. (yonik)
|
|
|
|
Build
|
|
1. SOLR-411. Changed the names of the Solr JARs to use the de facto standard JAR names based on
|
|
project-name-version.jar. This yields, for example:
|
|
apache-solr-common-1.3-dev.jar
|
|
apache-solr-solrj-1.3-dev.jar
|
|
apache-solr-1.3-dev.jar
|
|
|
|
2. SOLR-479: Added clover code coverage targets for committers and the nightly build. Requires
|
|
the Clover library, as licensed to Apache and only available privately. To run:
|
|
ant -Drun.clover=true clean clover test generate-clover-reports
|
|
|
|
3. SOLR-510: Nightly release includes client sources. (koji)
|
|
|
|
4. SOLR-563: Modified the build process to build contrib projects
|
|
(Shalin Shekhar Mangar via Otis Gospodnetic)
|
|
|
|
5. SOLR-673: Modify build file to create javadocs for core, solrj, contrib and "all inclusive" (shalin)
|
|
|
|
6. SOLR-672: Nightly release includes contrib sources. (Jeremy Hinegardner, shalin)
|
|
|
|
7. SOLR-586: Added ant target and POM files for building maven artifacts of the Solr core, common,
|
|
client and contrib. The target can publish artifacts with source and javadocs.
|
|
(Spencer Crissman, Craig McClanahan, shalin)
|
|
|
|
================== Release 1.2 ==================
|
|
|
|
Upgrading from Solr 1.1
|
|
-------------------------------------
|
|
IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
|
|
should be upgraded before the master! If the master were to be updated
|
|
first, the older searchers would not be able to read the new index format.
|
|
|
|
Older Apache Solr installations can be upgraded by replacing
|
|
the relevant war file with the new version. No changes to configuration
|
|
files should be needed.
|
|
|
|
This version of Solr contains a new version of Lucene implementing
|
|
an updated index format. This version of Solr/Lucene can still read
|
|
and update indexes in the older formats, and will convert them to the new
|
|
format on the first index change. One change in the new index format
|
|
is that all "norms" are kept in a single file, greatly reducing the number
|
|
of files per segment. Users of compound file indexes will want to consider
|
|
converting to the non-compound format for faster indexing and slightly better
|
|
search concurrency.
|
|
|
|
The JSON response format for facets has changed to make it easier for
|
|
clients to retain sorted order. Use json.nl=map explicitly in clients
|
|
to get the old behavior, or add it as a default to the request handler
|
|
in solrconfig.xml
|
|
|
|
The Lucene based Solr query syntax is slightly more strict.
|
|
A ':' in a field value must be escaped or the whole value must be quoted.
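
As a sketch of the two options named above, a client could either backslash-escape
each ':' in a value or wrap the whole value in double quotes before building a query
string. The helper below is an assumption for illustration (it is not a Solr or SolrJ
API) and handles only the ':' case described here, not every query syntax character.

```java
// Hypothetical client-side helper; not part of Solr or SolrJ.
final class ColonEscaping {

    /** Backslash-escapes every ':' in a field value. */
    static String escapeColons(String value) {
        return value.replace(":", "\\:");
    }

    /** Wraps the whole value in double quotes, escaping embedded '\' and '"'. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds field:value with the value's colons escaped. */
    static String fieldQuery(String field, String value) {
        return field + ":" + escapeColons(value);
    }

    public static void main(String[] args) {
        // "time" and "10:30" are made-up example data.
        System.out.println(fieldQuery("time", "10:30"));
        System.out.println("time:" + quote("10:30"));
    }
}
```

Quoting is simpler but may change how an analyzed field matches, since a quoted value
is parsed as a phrase.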
|
|
|
|
The Solr "Request Handler" framework has been updated in two key ways:
|
|
First, if a Request Handler is registered in solrconfig.xml with a name
|
|
starting with "/" then it can be accessed using path-based URL, instead of
|
|
using the legacy "/select?qt=name" URL structure. Second, the Request
|
|
Handler framework has been extended making it possible to write Request
|
|
Handlers that process streams of data for doing updates, and there is a
|
|
new-style Request Handler for XML updates given the name of "/update" in
|
|
the example solrconfig.xml. Existing installations without this "/update"
|
|
handler will continue to use the old update servlet and should see no
|
|
changes in behavior. For new-style update handlers, errors are now
|
|
reflected in the HTTP status code, Content-type checking is more strict,
|
|
and the response format has changed and is controllable via the wt
|
|
parameter.
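
To make the first change above concrete, the sketch below builds both URL styles for
a handler: the legacy "/select?qt=name" form and the path-based form for a handler
whose registered name starts with "/". The base URL, the handler names and the helper
class are assumptions for this example, not values taken from the example
solrconfig.xml.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical URL builder; not part of Solr or SolrJ.
final class HandlerUrls {

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Legacy style: the handler is selected with the qt parameter on /select. */
    static String legacyUrl(String base, String handlerName, String query) {
        return base + "/select?qt=" + enc(handlerName) + "&q=" + enc(query);
    }

    /** Path style: only valid for handlers registered with a name starting with "/". */
    static String pathUrl(String base, String handlerPath, String query) {
        if (!handlerPath.startsWith("/")) {
            throw new IllegalArgumentException(
                "path-based access needs a handler name starting with '/'");
        }
        return base + handlerPath + "?q=" + enc(query);
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr";  // assumed base URL
        System.out.println(legacyUrl(base, "spellchecker", "solr"));
        System.out.println(pathUrl(base, "/spellchecker", "solr"));
    }
}
```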
|
|
|
|
|
|
|
|
Detailed Change List
|
|
--------------------
|
|
|
|
New Features
|
|
1. SOLR-82: Default field values can be specified in the schema.xml.
|
|
(Ryan McKinley via hossman)
|
|
|
|
2. SOLR-89: Two new TokenFilters with corresponding Factories...
|
|
* TrimFilter - Trims leading and trailing whitespace from Tokens
|
|
* PatternReplaceFilter - applies a Pattern to each token in the
|
|
stream, replacing match occurrences with a specified replacement.
|
|
(hossman)
|
|
|
|
3. SOLR-91: allow configuration of a limit of the number of searchers
|
|
that can be warming in the background. This can be used to avoid
|
|
out-of-memory errors, or contention caused by more and more searchers
|
|
warming in the background. An error is thrown if the limit specified
|
|
by maxWarmingSearchers in solrconfig.xml is exceeded. (yonik)
|
|
|
|
4. SOLR-106: New faceting parameters that allow specification of a
|
|
minimum count for returned facets (facet.mincount), paging through facets
|
|
(facet.offset, facet.limit), and explicit sorting (facet.sort).
|
|
facet.zeros is now deprecated. (yonik)
|
|
|
|
5. SOLR-80: Negative queries are now allowed everywhere. Negative queries
|
|
are generated and cached as their positive counterpart, speeding
|
|
generation and generally resulting in smaller sets to cache.
|
|
Set intersections in SolrIndexSearcher are more efficient,
|
|
starting with the smallest positive set, subtracting all negative
|
|
sets, then intersecting with all other positive sets. (yonik)
|
|
|
|
6. SOLR-117: Limit field faceting to constraints with a prefix specified
|
|
by facet.prefix or f.<field>.facet.prefix. (yonik)
|
|
|
|
7. SOLR-107: JAVA API: Change NamedList to use Java5 generics
|
|
and implement Iterable<Map.Entry> (Ryan McKinley via yonik)
|
|
|
|
8. SOLR-104: Support for "Update Plugins" -- RequestHandlers that want
|
|
access to streams of data for doing updates. ContentStreams can come
|
|
from the raw POST body, multi-part form data, or remote URLs.
|
|
Included in this change is a new SolrDispatchFilter that allows
|
|
RequestHandlers registered with names that begin with a "/" to be
|
|
accessed using a URL structure based on that name.
|
|
(Ryan McKinley via hossman)
|
|
|
|
9. SOLR-126: DirectUpdateHandler2 supports autocommitting after a specified time
|
|
(in ms), using <autoCommit><maxTime>10000</maxTime></autoCommit>.
|
|
(Ryan McKinley via klaas).
|
|
|
|
10. SOLR-116: IndexInfoRequestHandler added. (Erik Hatcher)
|
|
|
|
11. SOLR-79: Add system property ${<sys.prop>[:<default>]} substitution for
|
|
configuration files loaded, including schema.xml and solrconfig.xml.
|
|
(Erik Hatcher with inspiration from Andrew Saar)
|
|
|
|
12. SOLR-149: Changes to make Solr more easily embeddable, in addition
|
|
to logging which request handler handled each request.
|
|
(Ryan McKinley via yonik)
|
|
|
|
13. SOLR-86: Added standalone Java-based command-line updater.
|
|
(Erik Hatcher via Bertrand Delacretaz)
|
|
|
|
14. SOLR-152: DisMaxRequestHandler now supports configurable alternate
|
|
behavior when q is not specified. A "q.alt" param can be specified
|
|
using SolrQueryParser syntax as a mechanism for specifying what query
|
|
the dismax handler should execute if the main user query (q) is blank.
|
|
(Ryan McKinley via hossman)
|
|
|
|
15. SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler
|
|
allows for specifying the amount of default slop to use when parsing
|
|
explicit phrase queries from the user.
|
|
(Adam Hiatt via hossman)
|
|
|
|
16. SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from
|
|
the Lucene contrib.
|
|
(Otis Gospodnetic and Adam Hiatt)
|
|
|
|
17. SOLR-182: allow lazy loading of request handlers on first request.
|
|
(Ryan McKinley via yonik)
|
|
|
|
18. SOLR-81: More SpellCheckerRequestHandler enhancements, including
|
|
support for relative or absolute directory path configurations, as
|
|
well as RAM based directory. (hossman)
|
|
|
|
19. SOLR-197: New parameters for input: stream.contentType for specifying
|
|
or overriding the content type of input, and stream.file for reading
|
|
local files. (Ryan McKinley via yonik)
|
|
|
|
20. SOLR-66: CSV data format for document additions and updates. (yonik)
|
|
|
|
21. SOLR-184: add echoHandler=true to responseHeader, support echoParams=all
|
|
(Ryan McKinley via ehatcher)
|
|
|
|
22. SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens
|
|
from the input string using a regex Pattern. (Ryan McKinley)
|
|
|
|
23. SOLR-162: Added a "Luke" request handler and other admin helpers.
|
|
This exposes the system status through the standard requestHandler
|
|
framework. (ryan)
|
|
|
|
24. SOLR-212: Added a DirectSolrConnection class. This lets you access
|
|
solr using the standard request/response formats, but does not require
|
|
an HTTP connection. It is designed for embedded applications. (ryan)
|
|
|
|
25. SOLR-204: The request dispatcher (added in SOLR-104) can handle
|
|
calls to /select. This offers uniform error handling for /update and
|
|
/select. To enable this behavior, you must add:
|
|
<requestDispatcher handleSelect="true" > to your solrconfig.xml
|
|
See the example solrconfig.xml for details. (ryan)
|
|
|
|
26. SOLR-170: StandardRequestHandler now supports a "sort" parameter.
|
|
Using the ';' syntax is still supported, but it is recommended to
|
|
transition to the new syntax. (ryan)
|
|
|
|
27. SOLR-181: The index schema now supports "required" fields. Attempts
|
|
to add a document without a required field will fail, returning a
|
|
descriptive error message. By default, the uniqueKey field is
|
|
a required field. This can be disabled by setting required=false
|
|
in schema.xml. (Greg Ludington via ryan)
|
|
|
|
28. SOLR-217: Fields configured in the schema to be neither indexed nor
|
|
stored will now be quietly ignored by Solr when Documents are added.
|
|
The example schema has a comment explaining how this can be used to
|
|
ignore any "unknown" fields.
|
|
(Will Johnson via hossman)
|
|
|
|
29. SOLR-227: If schema.xml defines multiple fieldTypes, fields, or
|
|
dynamicFields with the same name, a severe error will be logged rather
|
|
than quietly continuing. Depending on the <abortOnConfigurationError>
|
|
settings, this may halt the server. Likewise, if solrconfig.xml
|
|
defines multiple RequestHandlers with the same name it will also add
|
|
an error. (ryan)
|
|
|
|
30. SOLR-226: Added support for dynamic field as the destination of a
|
|
copyField using glob (*) replacement. (ryan)
|
|
|
|
31. SOLR-224: Adding a PhoneticFilterFactory that uses apache commons codec
|
|
language encoders to build phonetically similar tokens. This currently
|
|
supports: DoubleMetaphone, Metaphone, Soundex, and RefinedSoundex (ryan)
|
|
|
|
32. SOLR-199: new n-gram tokenizers available via NGramTokenizerFactory
|
|
and EdgeNGramTokenizerFactory. (Adam Hiatt via yonik)
|
|
|
|
33. SOLR-234: TrimFilter can update the Token's startOffset and endOffset
|
|
if updateOffsets="true". By default the Token offsets are unchanged.
|
|
(ryan)
|
|
|
|
34. SOLR-208: new example_rss.xsl and example_atom.xsl to provide more
|
|
examples for people about the Solr XML response format and how they
|
|
can transform it to suit different needs.
|
|
(Brian Whitman via hossman)
|
|
|
|
35. SOLR-249: Deprecated SolrException( int, ... ) constructors in favor
|
|
of constructors that take an ErrorCode enum. This will ensure that
|
|
all SolrExceptions use a valid HTTP status code. (ryan)
|
|
|
|
36. SOLR-386: Abstracted SolrHighlighter and moved existing implementation
|
|
to DefaultSolrHighlighter. Adjusted SolrCore and solrconfig.xml so
|
|
that highlighter is configurable via a class attribute. Allows users
|
|
to use their own highlighter implementation. (Tricia Williams via klaas)
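
As an illustration of the SOLR-80 entry above, the intersection order it
describes can be sketched with plain Python sets. This is a conceptual sketch
only, not the SolrIndexSearcher code: the function name intersect_filters and
the use of Python sets in place of DocSets are assumptions made for the
example.

```python
def intersect_filters(positive_sets, negative_sets):
    """Combine filter results in the order described for SOLR-80.

    positive_sets: non-empty list of sets of document ids that must match.
    negative_sets: list of sets of document ids that must NOT match
                   (each one held as its positive counterpart).
    """
    if not positive_sets:
        raise ValueError("at least one positive set is required")

    # 1. Start with the smallest positive set.
    ordered = sorted(positive_sets, key=len)
    result = set(ordered[0])

    # 2. Subtract every negative set.
    for negative in negative_sets:
        result -= negative

    # 3. Intersect with all remaining positive sets.
    for positive in ordered[1:]:
        result &= positive

    return result


docs = intersect_filters(
    positive_sets=[{1, 2, 3, 4, 5, 6}, {2, 3, 4}],
    negative_sets=[{3}],
)
print(sorted(docs))  # [2, 4]
```

Starting from the smallest set keeps every intermediate result small, which is
the efficiency argument the entry makes.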
|
|
|
|
Changes in runtime behavior
|
|
1. Highlighting using DisMax will only pick up terms from the main
|
|
user query, not boost or filter queries (klaas).
|
|
|
|
2. SOLR-125: Change default of json.nl to flat, change so that
|
|
json.nl only affects items where order matters (facet constraint
|
|
listings). Fix JSON output bug for null values. Internal JAVA API:
|
|
change most uses of NamedList to SimpleOrderedMap. (yonik)
|
|
|
|
3. A new method "getSolrQueryParser" has been added to the IndexSchema
|
|
class for retrieving a new SolrQueryParser instance with all options
|
|
specified in the schema.xml's <solrQueryParser> block set. The
|
|
documentation for the SolrQueryParser constructor and its use of
|
|
IndexSchema have also been clarified.
|
|
(Erik Hatcher and hossman)
|
|
|
|
4. DisMaxRequestHandler's bq, bf, qf, and pf parameters can now accept
|
|
multiple values (klaas).
|
|
|
|
5. Queries are rewritten before highlighting is performed. This enables
|
|
proper highlighting of prefix and wildcard queries (klaas).
|
|
|
|
6. A meaningful exception is raised when attempting to add a doc missing
|
|
a unique id if it is declared in the schema and allowDups=false.
|
|
(ryan via klaas)
|
|
|
|
7. SOLR-183: Exceptions with error code 400 are raised when
|
|
numeric argument parsing fails. RequiredSolrParams class added
|
|
to facilitate checking for parameters that must be present.
|
|
(Ryan McKinley, J.J. Larrea via yonik)
|
|
|
|
8. SOLR-179: By default, solr will abort after any severe initialization
|
|
errors. This behavior can be disabled by setting:
|
|
<abortOnConfigurationError>false</abortOnConfigurationError>
|
|
in solrconfig.xml (ryan)
|
|
|
|
9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
|
|
the new request dispatcher (SOLR-104). This requires posted content to
|
|
have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'
|
|
The response format matches that of /select and returns standard error
|
|
codes. To enable solr1.1 style /update, do not map "/update" to any
|
|
handler in solrconfig.xml (ryan)
|
|
|
|
10. SOLR-231: If a charset is not specified in the contentType,
|
|
ContentStream.getReader() will use UTF-8 encoding. (ryan)
|
|
|
|
11. SOLR-230: More options for post.jar to support stdin, xml on the
|
|
commandline, and deferring commits. Tutorial modified to take
|
|
advantage of these options so there is no need for curl.
|
|
(hossman)
|
|
|
|
12. SOLR-128: Upgraded Jetty to the latest stable release 6.1.3 (ryan)
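
The charset rule from the SOLR-231 entry above (fall back to UTF-8 when the
contentType names no charset) can be sketched as follows. This is an
illustrative Python sketch, not Solr's ContentStream code; the helper names
charset_for and decode_body are hypothetical.

```python
def charset_for(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or the default."""
    if content_type:
        # Parameters follow the media type, separated by ';'.
        for part in content_type.split(";")[1:]:
            name, _, value = part.partition("=")
            if name.strip().lower() == "charset" and value.strip():
                return value.strip().strip('"').lower()
    return default


def decode_body(raw_bytes, content_type):
    """Decode a posted body using the charset rule above."""
    return raw_bytes.decode(charset_for(content_type))


print(charset_for("text/xml; charset=ISO-8859-1"))  # iso-8859-1
print(charset_for("text/xml"))                      # utf-8
```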
|
|
|
|
Optimizations
|
|
1. SOLR-114: HashDocSet specific implementations of union() and andNot()
|
|
for a 20x performance improvement for those set operations, and a new
|
|
hash algorithm speeds up exists() by 10% and intersectionSize() by 8%.
|
|
(yonik)
|
|
|
|
2. SOLR-115: Solr now uses BooleanQuery.clauses() instead of
|
|
BooleanQuery.getClauses() in any situation where there is no risk of
|
|
modifying the original query.
|
|
(hossman)
|
|
|
|
3. SOLR-221: Speed up sorted faceting on multivalued fields by ~60%
|
|
when the base set consists of a relatively large portion of the
|
|
index. (yonik)
|
|
|
|
4. SOLR-221: Added a facet.enum.cache.minDf parameter which avoids
|
|
using the filterCache for terms that match few documents, trading
|
|
decreased memory usage for increased query time. (yonik)
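
The trade-off behind facet.enum.cache.minDf in the SOLR-221 entry above can be
pictured with a small model. This is a conceptual sketch under stated
assumptions, not Solr's faceting code: postings are modelled as plain lists of
document ids, the filterCache as a dict, and the choice of ">=" at the
threshold is an assumption.

```python
def facet_counts(postings, base_docs, filter_cache, min_df):
    """Count, per term, how many documents in base_docs contain the term.

    postings:     dict mapping term -> list of document ids containing it.
    base_docs:    set of document ids matched by the main query.
    filter_cache: dict used as a stand-in for the filterCache.
    min_df:       terms matching fewer documents than this bypass the cache.
    """
    counts = {}
    for term, doc_ids in postings.items():
        if len(doc_ids) >= min_df:
            # Frequent term: build its document set once, then reuse it.
            if term not in filter_cache:
                filter_cache[term] = frozenset(doc_ids)
            counts[term] = len(filter_cache[term] & base_docs)
        else:
            # Rare term: walk the postings directly, leaving the cache alone.
            counts[term] = sum(1 for doc_id in doc_ids if doc_id in base_docs)
    return counts


cache = {}
postings = {"red": [1, 2, 3, 4], "blue": [2], "green": [5, 6]}
print(facet_counts(postings, {2, 3, 5}, cache, min_df=2))
print(sorted(cache))  # only the terms that met the threshold are cached
```

Rare terms never occupy a cache slot, which lowers memory use, but their
counts are recomputed on every request.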
|
|
|
|
Bug Fixes
|
|
1. SOLR-87: Parsing of synonym files did not correctly handle escaped
|
|
whitespace such as \r\n\t\b\f. (yonik)
|
|
|
|
2. SOLR-92: DOMUtils.getText (used when parsing config files) did not
|
|
work properly with many DOM implementations when dealing with
|
|
"Attributes". (Ryan McKinley via hossman)
|
|
|
|
3. SOLR-9, SOLR-99: Tighten up sort specification error checking, throw
|
|
exceptions for missing sort specifications or a sort on a non-indexed
|
|
field. (Ryan McKinley via yonik)
|
|
|
|
4. SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions
|
|
were being ignored by all "out of the box" RequestHandlers. (hossman)
|
|
|
|
5. SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved
|
|
some JNDI related code to the init method of a Servlet Filter -
|
|
according to the Servlet Spec, all Filters should be initialized
|
|
prior to initializing any Servlets, but this is not the case in at
|
|
least one Servlet Container (Resin). This "bug fix" refactors
|
|
this JNDI code so that it should be executed the first time any
|
|
attempt is made to use the solr.home dir.
|
|
(Ryan McKinley via hossman)
|
|
|
|
6. SOLR-173: Bug fix to SolrDispatchFilter to reduce "too many open
|
|
files" problems: SolrDispatchFilter was not closing requests
|
|
when finished. Also modified ResponseWriters to only fetch a Searcher
|
|
reference if necessary for writing out DocLists.
|
|
(Ryan McKinley via hossman)
|
|
|
|
7. SOLR-168: Fix display positioning of multiple tokens at the same
|
|
position in analysis.jsp (yonik)
|
|
|
|
8. SOLR-167: The SynonymFilter sometimes generated incorrect offsets when
|
|
multi token synonyms were matched in the source text. (yonik)
|
|
|
|
9. SOLR-188: bin scripts do not support non-default webapp names. Added "-U"
|
|
option to specify a full path to the update url, overriding the
|
|
"-h" (hostname), "-p" (port) and "-w" (webapp name) parameters.
|
|
(Jeff Rodenburg via billa)
|
|
|
|
10. SOLR-198: RunExecutableListener always waited for the process to
|
|
finish, even when wait="false" was set. (Koji Sekiguchi via yonik)
|
|
|
|
11. SOLR-207: Changed distribution scripts to remove recursive find
|
|
and avoid use of "find -maxdepth" on platforms where it is not
|
|
supported. (yonik)
|
|
|
|
12. SOLR-222: Changing writeLockTimeout in solrconfig.xml did not
|
|
change the effective timeout. (Koji Sekiguchi via yonik)
|
|
|
|
13. Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx can not
|
|
access handlers that start with "/". This makes path based authentication
|
|
possible for path based request handlers. (ryan)
|
|
|
|
14. SOLR-214: Some servlet containers (including Tomcat and Resin) do not
|
|
obey the specified charset. Rather than letting the container handle
|
|
it, solr now uses the charset from the header contentType to decode posted
|
|
content. Using the contentType: "text/xml; charset=utf-8" will force
|
|
utf-8 encoding. If you do not specify a contentType, it will use the
|
|
platform default. (Koji Sekiguchi via ryan)
|
|
|
|
15. SOLR-241: Undefined system properties used in configuration files now
|
|
cause a clear message to be logged rather than an obscure exception thrown.
|
|
(Koji Sekiguchi via ehatcher)
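
The ${<sys.prop>[:<default>]} syntax from SOLR-79, together with the clear
error for undefined properties described in SOLR-241 above, can be sketched as
below. This is a hedged Python illustration rather than Solr's implementation:
the function name substitute, the property name example.dir, and the error
wording are all hypothetical.

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text, props):
    """Expand ${name} and ${name:default} placeholders using props."""

    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        # Undefined and no default: fail with a message naming the property.
        raise ValueError(
            "No system property or default value specified for " + name
        )

    return _PLACEHOLDER.sub(replace, text)


line = "<dataDir>${example.dir:./solr/data}</dataDir>"
print(substitute(line, {}))                             # uses the default
print(substitute(line, {"example.dir": "/var/solr"}))   # uses the property
```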
|
|
|
|
Other Changes
|
|
1. Updated to Lucene 2.1
|
|
|
|
2. Updated to Lucene 2007-05-20_00-04-53
|
|
|
|
================== Release 1.1.0 ==================
|
|
|
|
Status
|
|
------
|
|
This is the first release since Solr joined the Incubator, and brings many
|
|
new features and performance optimizations including highlighting,
|
|
faceted browsing, and JSON/Python/Ruby response formats.
|
|
|
|
|
|
Upgrading from previous Solr versions
|
|
-------------------------------------
|
|
Older Apache Solr installations can be upgraded by replacing
|
|
the relevant war file with the new version. No changes to configuration
|
|
files are needed and the index format has not changed.
|
|
|
|
The default version of the Solr XML response syntax has been changed to 2.2.
|
|
Behavior can be preserved for those clients not explicitly specifying a
|
|
version by adding a default to the request handler in solrconfig.xml
|
|
|
|
By default, Solr will no longer use a searcher that has not fully warmed,
|
|
and requests will block in the meantime. To change back to the previous
|
|
behavior of using a cold searcher in the event there is no other
|
|
warm searcher, see the useColdSearcher config item in solrconfig.xml
|
|
|
|
The XML response format when adding multiple documents to the collection
|
|
in a single <add> command has changed to return a single <result>.
|
|
|
|
|
|
Detailed Change List
|
|
--------------------
|
|
|
|
New Features
|
|
1. added support for setting Lucene's positionIncrementGap
|
|
2. Admin: new statistics for SolrIndexSearcher
|
|
3. Admin: caches now show config params on stats page
|
|
3. max() function added to FunctionQuery suite
|
|
4. postOptimize hook, mirroring the functionality of the postCommit hook,
|
|
but only called on an index optimize.
|
|
5. Ability to HTTP POST query requests to /select in addition to HTTP-GET
|
|
6. The default search field may now be overridden by requests to the
|
|
standard request handler using the df query parameter. (Erik Hatcher)
|
|
7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter)
|
|
8. Support for customizing the QueryResponseWriter per request
|
|
(Mike Baranczak / SOLR-16 / hossman)
|
|
9. Added KeywordTokenizerFactory (hossman)
|
|
10. copyField accepts dynamicfield-like names as the source.
|
|
(Darren Erik Vengroff via yonik, SOLR-21)
|
|
11. new DocSet.andNot(), DocSet.andNotSize() (yonik)
|
|
12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23)
|
|
13. New abstract BufferedTokenStream for people who want to write
|
|
Tokenizers or TokenFilters that require arbitrary buffering of the
|
|
stream. (SOLR-11 / yonik, hossman)
|
|
14. New RemoveDuplicatesToken - useful in situations where
|
|
synonyms, stemming, or word delimiting produce identical tokens at
|
|
the same position. (SOLR-11 / yonik, hossman)
|
|
15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler
|
|
and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik)
|
|
16. SnowballPorterFilterFactory language is configurable via the "language"
|
|
attribute, with the default being "English". (Bertrand Delacretaz via yonik, SOLR-27)
|
|
17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents.
|
|
(Bertrand Delacretaz via yonik, SOLR-28)
|
|
18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby"
|
|
(yonik, SOLR-31)
|
|
19. Make web admin pages return UTF-8, change Content-type declaration to include a
|
|
space between the mime-type and charset (Philip Jacob, SOLR-35)
|
|
20. Made query parser default operator configurable via schema.xml:
|
|
<solrQueryParser defaultOperator="AND|OR"/>
|
|
The default operator remains "OR".
|
|
21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes
|
|
flags (Greg Ludington via yonik, SOLR-39)
|
|
22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
|
|
words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41)
|
|
23. Added a CompressableField base class which allows fields of derived types to
|
|
be compressed using the compress=true setting. The field type also gains the
|
|
ability to specify a size threshold at which field data is compressed.
|
|
(klaas, SOLR-45)
|
|
24. Simple faceted search support for fields (enumerating terms)
|
|
and arbitrary queries added to both StandardRequestHandler and
|
|
DisMaxRequestHandler. (hossman, SOLR-44)
|
|
25. In addition to specifying default RequestHandler params in the
|
|
solrconfig.xml, support has been added for configuring values to be
|
|
appended to the multi-val request params, as well as for configuring
|
|
invariant params that cannot be overridden in the query. (hossman, SOLR-46)
|
|
26. Default operator for query parsing can now be specified with q.op=AND|OR
|
|
from the client request, overriding the schema value. (ehatcher)
|
|
27. New XSLTResponseWriter does server side XSLT processing of XML Response.
|
|
In the process, an init(NamedList) method was added to QueryResponseWriter
|
|
which works the same way as SolrRequestHandler.
|
|
(Bertrand Delacretaz / SOLR-49 / hossman)
|
|
28. json.wrf parameter adds a wrapper-function around the JSON response,
|
|
useful in AJAX with dynamic script tags for specifying a JavaScript
|
|
callback function. (Bertrand Delacretaz via yonik, SOLR-56)
|
|
29. autoCommit can be specified every so many documents added (klaas, SOLR-65)
|
|
30. ${solr.home}/lib directory can now be used for specifying "plugin" jars
|
|
(hossman, SOLR-68)
|
|
31. Support for "Date Math" relative "NOW" when specifying values of a
|
|
DateField in a query -- or when adding a document.
|
|
(hossman, SOLR-71)
|
|
32. useColdSearcher control in solrconfig.xml prevents the first searcher
|
|
from being used before it's done warming. This can help prevent
|
|
thrashing on startup when multiple requests hit a cold searcher.
|
|
The default is "false", preventing use before warm. (yonik, SOLR-77)
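
To illustrate the json.wrf entry (SOLR-56) above: the parameter names a
function that is wrapped around the JSON response so that a script tag can
deliver it to a JavaScript callback. The following is a minimal sketch of that
wrapping, assuming a response already available as a Python dict; wrap_json
and handleResults are hypothetical names.

```python
import json


def wrap_json(response, wrf=None):
    """Serialize a response, wrapping it in a function call if wrf is given."""
    body = json.dumps(response)
    if wrf:
        return f"{wrf}({body})"
    return body


print(wrap_json({"numFound": 0}))                    # {"numFound": 0}
print(wrap_json({"numFound": 0}, "handleResults"))   # handleResults({"numFound": 0})
```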
|
|
|
|
Changes in runtime behavior
|
|
1. classes reorganized into different packages, package names changed to Apache
|
|
2. force read of document stored fields in QuerySenderListener
|
|
3. Solr now looks in ./solr/conf for config, ./solr/data for data
|
|
configurable via solr.solr.home system property
|
|
4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize
|
|
customization and per-field overrides on many options
|
|
(Andrew May via klaas, SOLR-37)
|
|
5. Default param values for DisMaxRequestHandler should now be specified
|
|
using a '<lst name="defaults">...</lst>' init param, for backwards
|
|
compatibility, all init params will be used as defaults if an init param
|
|
with that name does not exist. (hossman, SOLR-43)
|
|
6. The DisMaxRequestHandler now supports multiple occurrences of the "fq"
|
|
param. (hossman, SOLR-44)
|
|
7. FunctionQuery.explain now uses ComplexExplanation to provide more
|
|
accurate score explanations when composed in a BooleanQuery.
|
|
(hossman, SOLR-25)
|
|
8. Document update handling locking is much sparser, allowing performance gains
|
|
through multiple threads. Large commits also might be faster (klaas, SOLR-65)
|
|
9. Lazy field loading can be enabled via a solrconfig directive. This will be faster when
|
|
not all stored fields are needed from a document (klaas, SOLR-52)
|
|
10. Made admin JSPs return XML and transform them with new XSL stylesheets
|
|
(Otis Gospodnetic, SOLR-58)
|
|
11. If the "echoParams=explicit" request parameter is set, request parameters are copied
|
|
to the output. In XML output, they appear in a new <lst name="params"> list inside
|
|
the new <lst name="responseHeader"> element, which replaces the old <responseHeader>.
|
|
Adding a version=2.1 parameter to the request produces the old format, for backwards
|
|
compatibility (bdelacretaz and yonik, SOLR-59).
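
A client can read the echoed parameters described in the SOLR-59 entry above
out of the <lst name="responseHeader"> element. The sketch below parses a
hand-written sample; the sample document, including its <int> and <str>
child elements, is an assumption made for illustration and not captured
server output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape the entry describes (assumed, not captured).
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <lst name="params">
      <str name="q">solr</str>
      <str name="echoParams">explicit</str>
    </lst>
  </lst>
</response>"""


def echoed_params(xml_text):
    """Return the echoed request parameters as a dict, or {} if absent."""
    root = ET.fromstring(xml_text)
    params = root.find("./lst[@name='responseHeader']/lst[@name='params']")
    if params is None:
        return {}
    return {child.get("name"): child.text for child in params}


print(echoed_params(SAMPLE))
```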
|
|
|
|
Optimizations
|
|
1. getDocListAndSet can now generate both a DocList and a DocSet from a
|
|
single lucene query.
|
|
2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
|
|
set
|
|
3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
|
|
Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
|
|
are between 3 and 4 times faster. (yonik, SOLR-15)
|
|
4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
|
|
5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
|
|
queries where DocSets aren't cached (for example, if the number of terms in the field
|
|
is larger than the filter cache.) (yonik)
|
|
6. Optimized facet.field faceting by as much as 500 times when the field has
|
|
a single token per document (not multiValued & not tokenized) by using the
|
|
Lucene FieldCache entry for that field to tally term counts. The first request
|
|
utilizing the FieldCache will take longer than subsequent ones.
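
The idea behind the facet.field optimization above (one value per document,
so counts can be tallied from a per-document array) can be sketched as
follows. This is a conceptual model only: a Python list indexed by document id
stands in for the Lucene FieldCache entry, and tally_by_field_value is a
hypothetical name.

```python
from collections import Counter


def tally_by_field_value(values_by_doc, matching_docs):
    """Count field values over the matching documents.

    values_by_doc: list where index i holds the single field value of
                   document i (None if the document has no value).
    matching_docs: iterable of document ids matched by the query.
    """
    counts = Counter()
    for doc_id in matching_docs:
        value = values_by_doc[doc_id]
        if value is not None:
            counts[value] += 1
    return counts


values = ["a", "b", "a", None, "b", "a"]
print(tally_by_field_value(values, [0, 2, 3, 4]))
```

Each matching document costs a single array lookup, with no per-term set
intersection. Building the array is the one-time cost that makes the first
request slower than later ones.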
|
|
|
|
Bug Fixes
|
|
1. Fixed delete-by-id for field types whose indexed form is different
|
|
from the printable form (mainly sortable numeric types).
|
|
2. Added escaping of attribute values in the XML response (Erik Hatcher)
|
|
3. Added empty extractTerms() to FunctionQuery to enable use in
|
|
a MultiSearcher (Yonik)
|
|
4. WordDelimiterFilter sometimes lost token positionIncrement information
|
|
5. Fix reverse sorting for fields where sortMissingFirst=true
|
|
(Rob Staveley, yonik)
|
|
6. Worked around a Jetty bug that caused invalid XML responses for fields
|
|
containing non ASCII chars. (Bertrand Delacretaz via yonik, SOLR-32)
|
|
7. WordDelimiterFilter can throw exceptions if configured with both
|
|
generate and catenate off. (Mike Klaas via yonik, SOLR-34)
|
|
8. Escape '>' in XML output (because ]]> is illegal in CharData)
|
|
9. field boosts weren't being applied and doc boosts were being applied to fields (klaas)
|
|
10. Multiple-doc update generates well-formed xml (klaas, SOLR-65)
|
|
11. Better parsing of pingQuery from solrconfig.xml (hossman, SOLR-70)
|
|
12. Fixed bug with "Distribution" page introduced when Versions were
|
|
added to "Info" page (hossman)
|
|
13. Fixed HTML escaping issues with user input to analysis.jsp and action.jsp
|
|
(hossman, SOLR-74)
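
The escaping fixes in entries 2 and 8 above (attribute values, and '>' so that
"]]>" never appears in character data) can be illustrated with Python's
standard library. This is a sketch of the escaping rules, not Solr's XML
writer; render_str is a hypothetical helper.

```python
from xml.sax.saxutils import escape, quoteattr


def render_str(name, value):
    """Render a <str> element with an escaped attribute and escaped text."""
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape replaces &, < and > in character data.
    return f"<str name={quoteattr(name)}>{escape(value)}</str>"


print(render_str("title", "a ]]> b & c"))
# <str name="title">a ]]&gt; b &amp; c</str>
```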
|
|
|
|
Other Changes
|
|
1. Upgrade to Lucene 2.0 nightly build 2006-06-22, lucene SVN revision 416224,
|
|
http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224
|
|
2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6)
|
|
3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302,
|
|
4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18)
|
|
5. Updated to Lucene 2.0 nightly build 2006-09-07, SVN revision 462111
|
|
6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48)
|
|
7. backslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63
|
|
8. check solr return code in admin scripts, SOLR-62
|
|
9. Updated to Lucene 2.0 nightly build 2006-11-15, SVN revision 475069
|
|
10. Removed src/apps containing the legacy "SolrTest" app (hossman, SOLR-3)
|
|
11. Simplified index.jsp and form.jsp, primarily by removing/hiding XML
|
|
specific params, and adding an option to pick the output type. (hossman)
|
|
12. Added new numeric build property "specversion" to allow clean
|
|
MANIFEST.MF files (hossman)
|
|
13. Added Solr/Lucene versions to "Info" page (hossman)
|
|
14. Explicitly set mime-type of .xsl files in web.xml to
|
|
application/xslt+xml (hossman)
|
|
15. Config parsing should now work using DOM Level 2 parsers -- Solr
|
|
previously relied on getTextContent which is a DOM Level 3 addition
|
|
(Alexander Saar via hossman, SOLR-78)
|
|
|
|
2006/01/17 Solr open sourced, moves to Apache Incubator
|