SOLR-3650: checkpoint - merged in CHANGES.txt entries from contrib/analysis-extras contrib/langid contrib/clustering

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1367377 13f79535-47bb-0310-9956-ffa450edef68
Chris M. Hostetter 2012-07-31 00:34:01 +00:00
parent 179a0c87bd
commit 6961d9f589
4 changed files with 88 additions and 186 deletions


@@ -554,6 +554,11 @@ New Features
* SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder
in example solrconfig.xml. (Sebastian Lutze, koji)
* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much
more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also
supports Locale-sensitive range queries. (rmuir)
Optimizations
----------------------
@@ -701,6 +706,10 @@ Bug Fixes
the hashCode implementation of {!bbox} and {!geofilt} queries.
(hossman)
* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
are respected now (Stanislaw Osinski, Dawid Weiss)
Other Changes
----------------------
@@ -886,6 +895,9 @@ Bug Fixes:
* SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)
* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
are respected now (Stanislaw Osinski, Dawid Weiss)
================== 3.6.0 ==================
More information about this release, including any errata related to the
release notes, upgrade instructions, or other changes may be found online at:
@@ -1028,6 +1040,11 @@ New Features
exception from being thrown by the default parser if "q" is missing. (yonik)
SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)
* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
These can be used to customize range query/sort behavior, for example to
support numeric collation, ignore punctuation/whitespace, ignore accents but
not case, control whether upper/lowercase values are sorted first, etc. (rmuir)
Optimizations
----------------------
* SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter
@@ -1189,6 +1206,35 @@ Bug Fixes
* SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and
sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen)
* SOLR-3107: contrib/langid: When using the LangDetect implementation of
langid, set the random seed to 0, so that the same document is detected as
the same language with the same probability every time.
(Christian Moen via rmuir)
* SOLR-2937: Configuring the number of contextual snippets used for
search results clustering. The hl.snippets parameter is now respected
by the clustering plugin, can be overridden by carrot.summarySnippets
if needed (Stanislaw Osinski).
* SOLR-2938: Clustering on multiple fields. The carrot.title and
carrot.snippet can now take comma- or space-separated lists of
field names to cluster (Stanislaw Osinski).
* SOLR-2939: Clustering of multilingual search results. The document's
language field can be passed in the carrot.lang parameter, the carrot.lcmap
parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component.
The custom field mappings are defined using the carrot.custom parameter
(Stanislaw Osinski).
* SOLR-2941: NullPointerException on clustering component initialization
when schema does not have a unique key field (Stanislaw Osinski).
* SOLR-2942: ClassCastException when passing non-textual fields to
clustering component (Stanislaw Osinski).
Other Changes
----------------------
* SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji)
@@ -1294,6 +1340,9 @@ New Features
request param that can be used to delete all but the most recent N backups.
(James Dyer via hossman)
* SOLR-2839: Add alternative implementation to contrib/langid supporting 53
languages, based on http://code.google.com/p/language-detection/ (rmuir)
Optimizations
----------------------
@@ -1516,6 +1565,12 @@ Bug Fixes
failed due to sort by function changes introduced in SOLR-1297
(Mitsu Hadeishi, hossman)
* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter
now works with absolute directories (Stanislaw Osinski)
* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise"
changed to "carrot.fragSize" (Stanislaw Osinski).
Other Changes
----------------------
@@ -1671,6 +1726,12 @@ New Features
Explanation objects in its responses instead of
Explanation.toString (hossman)
* SOLR-2448: Search results clustering updates: bisecting k-means
clustering algorithm added, loading of Carrot2 stop words from
<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
(Stanislaw Osinski, Dawid Weiss).
Optimizations
----------------------
@@ -2014,6 +2075,26 @@ New Features
* SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji)
* SOLR-1804: Re-enabled clustering component on trunk, updated to latest
version of Carrot2. No more LGPL run-time dependencies. This release of
C2 also does not have a specific Lucene dependency.
(Stanislaw Osinski, gsingers)
* SOLR-2282: Add distributed search support for search result clustering.
(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
tokenizer and filters to contrib/analysis-extras (rmuir)
* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
UAX#29, a unicode algorithm with good results for most languages, as well as
URL and E-mail tokenization according to the relevant RFCs.
(Tom Burton-West via rmuir)
* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
Optimizations
----------------------
@@ -2035,6 +2116,10 @@ Optimizations
* SOLR-2046: add common functions to scripts-util. (koji)
* SOLR-1684: Switch clustering component to use the
SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document
cache (gsingers)
Bug Fixes
----------------------
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
@@ -2289,6 +2374,9 @@ Bug Fixes
* SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
thread safe and could throw an exception. (yonik)
* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary
option (gsingers)
Other Changes
----------------------


@@ -1,63 +0,0 @@
Apache Solr - Analysis Extras
Release Notes
Introduction
------------
The analysis-extras plugin provides additional analyzers that rely
upon large dependencies/dictionaries.
It includes integration with ICU for multilingual support, and
analyzers for Chinese and Polish.
$Id$
================== 5.0.0 ==============
(No changes)
================== 4.0.0-ALPHA ==============
* SOLR-2396: Add ICUCollationField, which is much more efficient than
the Solr 3.x ICUCollationKeyFilterFactory, and also supports
Locale-sensitive range queries. (rmuir)
================== 3.6.1 ==================
(No Changes)
================== 3.6.0 ==================
* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
These can be used to customize range query/sort behavior, for example to
support numeric collation, ignore punctuation/whitespace, ignore accents but
not case, control whether upper/lowercase values are sorted first, etc. (rmuir)
================== 3.5.0 ==================
(No Changes)
================== 3.4.0 ==================
(No Changes)
================== 3.3.0 ==================
(No Changes)
================== 3.2.0 ==================
(No Changes)
================== 3.1.0 ==================
* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
tokenizer and filters to contrib/analysis-extras (rmuir)
* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
UAX#29, a unicode algorithm with good results for most languages, as well as
URL and E-mail tokenization according to the relevant RFCs.
(Tom Burton-West via rmuir)
* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)


@@ -1,87 +0,0 @@
Apache Solr Clustering Implementation
Intro:
See http://wiki.apache.org/solr/ClusteringComponent
CHANGES
$Id$
================== Release 5.0.0 ==============
(No changes)
================== Release 4.0.0-ALPHA ==============
* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are
respected now (Stanislaw Osinski, Dawid Weiss)
================== Release 3.6.1 ==================
* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are
respected now (Stanislaw Osinski, Dawid Weiss)
================== Release 3.6.0 ==================
* SOLR-2937: Configuring the number of contextual snippets used for
search results clustering. The hl.snippets parameter is now respected
by the clustering plugin, can be overridden by carrot.summarySnippets
if needed (Stanislaw Osinski).
* SOLR-2938: Clustering on multiple fields. The carrot.title and
carrot.snippet can now take comma- or space-separated lists of
field names to cluster (Stanislaw Osinski).
* SOLR-2939: Clustering of multilingual search results. The document's
language field can be passed in the carrot.lang parameter, the carrot.lcmap
parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
* SOLR-2940: Passing values for custom Carrot2 fields. The custom field
mappings are defined using the carrot.custom parameter (Stanislaw Osinski).
* SOLR-2941: NullPointerException on clustering component initialization
when schema does not have a unique key field (Stanislaw Osinski).
* SOLR-2942: ClassCastException when passing non-textual fields for
clustering (Stanislaw Osinski).
================== Release 3.5.0 ==================
(No Changes)
================== Release 3.4.0 ==================
* SOLR-2706: The carrot.lexicalResourcesDir parameter now works
with absolute directories (Stanislaw Osinski)
* SOLR-2692: Typo in param name fixed: "carrot.fragzise" changed to
"carrot.fragSize" (Stanislaw Osinski).
================== Release 3.3.0 ==================
(No Changes)
================== Release 3.2.0 ==================
* SOLR-2448: Search results clustering updates: bisecting k-means
clustering algorithm added, loading of Carrot2 stop words from
<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
(Stanislaw Osinski, Dawid Weiss).
================== Release 3.1.0 ==================
* SOLR-1684: Switch to use the SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document cache (gsingers)
* SOLR-1692: Fix bug relating to carrot.produceSummary option (gsingers)
* SOLR-1804: Re-enabled clustering on trunk, updated to latest version of Carrot2. No more LGPL run-time dependencies.
This release of C2 also does not have a specific Lucene dependency. (Stanislaw Osinski, gsingers)
* SOLR-2282: Add distributed search support for search result clustering.
(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
================== Release 1.4.0 ==================
Solr Clustering will be released for the first time in Solr 1.4. See http://wiki.apache.org/solr/ClusteringComponent
for details on using.
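
Note on SOLR-2938: the entry says carrot.title and carrot.snippet accept comma- or space-separated lists of field names. A minimal sketch of how such a value could be split is given here. The class FieldListSketch and the helper splitFieldNames are hypothetical names used only for illustration; this is not the clustering component's actual parsing code.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldListSketch {

    // Hypothetical helper (not Solr's API): splits a parameter value on
    // commas and/or whitespace and drops any empty pieces.
    static List<String> splitFieldNames(String value) {
        List<String> fields = new ArrayList<>();
        if (value == null) {
            return fields;
        }
        for (String part : value.trim().split("[,\\s]+")) {
            if (!part.isEmpty()) {
                fields.add(part);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Mixed comma and space separators in one value.
        System.out.println(splitFieldNames("title, body summary")); // [title, body, summary]
    }
}
```

Treating both separators as one delimiter class keeps "a,b", "a b", and "a, b" equivalent, which matches the wording of the entry; how the real component treats edge cases is not stated in this file.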


@@ -1,36 +0,0 @@
Apache Solr Language Identifier
Release Notes
This file describes changes to the SolrTika Language Identifier (contrib/langid) module.
See http://wiki.apache.org/solr/LanguageDetection for details
$Id$
================== Release 5.0.0 ==================
(No changes)
================== Release 4.0.0-ALPHA ==================
(No changes)
================== Release 3.6.1 ==================
(No Changes)
================== Release 3.6.0 ==================
* SOLR-3107: When using the LangDetect implementation of langid, set the random
seed to 0, so that the same document is detected as the same language with
the same probability every time. (Christian Moen via rmuir)
================== Release 3.5.0 ==================
Initial release. See README.txt.
* SOLR-1979: New contrib "langid". Adds language identification capabilities as an
Update Processor, using Tika's LanguageIdentifier (janhoy, Tommaso Teofili, gsingers)
* SOLR-2839: Add alternative implementation supporting 53 languages,
based on http://code.google.com/p/language-detection/ (rmuir)
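
Note on SOLR-3107: the entry fixes the random seed to 0 so that the same document always yields the same language with the same probability. The sketch below only illustrates the general principle with java.util.Random. The class SeededSamplingSketch and its sample method are hypothetical stand-ins and do not reflect how LangDetect uses its random source internally.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededSamplingSketch {

    // Hypothetical stand-in for a randomized step: draws n pseudo-random
    // values from a generator created with the given seed.
    static double[] sample(long seed, int n) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextDouble();
        }
        return out;
    }

    public static void main(String[] args) {
        // Two independent runs with the same seed produce identical sequences.
        double[] first = sample(0L, 5);
        double[] second = sample(0L, 5);
        System.out.println(Arrays.equals(first, second)); // true
    }
}
```

Any algorithm whose only source of variation is such a generator becomes repeatable once the seed is fixed, which is the behaviour the entry describes.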