mirror of https://github.com/apache/lucene.git
SOLR-3650: checkpoint - merged in CHANGES.txt entries from contrib/analysis-extras contrib/langid contrib/clustering
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1367377 13f79535-47bb-0310-9956-ffa450edef68
parent 179a0c87bd
commit 6961d9f589
@@ -554,6 +554,11 @@ New Features
 * SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder
 in example solrconfig.xml. (Sebastian Lutze, koji)

+* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much
+more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also
+supports Locale-sensitive range queries. (rmuir)
+
+
 Optimizations
 ----------------------

@@ -701,6 +706,10 @@ Bug Fixes
 the hashCode implementation of {!bbox} and {!geofilt} queries.
 (hossman)

+* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
+are respected now (Stanislaw Osinski, Dawid Weiss)
+
+
 Other Changes
 ----------------------

@@ -886,6 +895,9 @@ Bug Fixes:

 * SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)

+* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
+are respected now (Stanislaw Osinski, Dawid Weiss)
+
 ================== 3.6.0 ==================
 More information about this release, including any errata related to the
 release notes, upgrade instructions, or other changes may be found online at:
@@ -1028,6 +1040,11 @@ New Features
 exception from being thrown by the default parser if "q" is missing. (yonik)
 SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)

+* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
+These can be used to customize range query/sort behavior, for example to
+support numeric collation, ignore punctuation/whitespace, ignore accents but
+not case, control whether upper/lowercase values are sorted first, etc. (rmuir)
+
 Optimizations
 ----------------------
 * SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter
@@ -1189,6 +1206,35 @@ Bug Fixes
 * SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and
 sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen)

+* SOLR-3107: contrib/langid: When using the LangDetect implementation of
+langid, set the random seed to 0, so that the same document is detected as
+the same language with the same probability every time.
+(Christian Moen via rmuir)
+
+* SOLR-2937: Configuring the number of contextual snippets used for
+search results clustering. The hl.snippets parameter is now respected
+by the clustering plugin, can be overridden by carrot.summarySnippets
+if needed (Stanislaw Osinski).
+
+* SOLR-2938: Clustering on multiple fields. The carrot.title and
+carrot.snippet can now take comma- or space-separated lists of
+field names to cluster (Stanislaw Osinski).
+
+* SOLR-2939: Clustering of multilingual search results. The document's
+language field be passed in the carrot.lang parameter, the carrot.lcmap
+parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
+
+* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component.
+The custom field mapping are defined using the carrot.custom parameter
+(Stanislaw Osinski).
+
+* SOLR-2941: NullPointerException on clustering component initialization
+when schema does not have a unique key field (Stanislaw Osinski).
+
+* SOLR-2942: ClassCastException when passing non-textual fields to
+clustering component (Stanislaw Osinski).
+
+
 Other Changes
 ----------------------
 * SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji)
@@ -1294,6 +1340,9 @@ New Features
 request param that can be used to delete all but the most recent N backups.
 (James Dyer via hossman)

+* SOLR-2839: Add alternative implementation to contrib/langid supporting 53
+languages, based on http://code.google.com/p/language-detection/ (rmuir)
+
 Optimizations
 ----------------------

@@ -1516,6 +1565,12 @@ Bug Fixes
 failed due to sort by function changes introduced in SOLR-1297
 (Mitsu Hadeishi, hossman)

+* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter
+now works with absolute directories (Stanislaw Osinski)
+
+* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise"
+changed to "carrot.fragSize" (Stanislaw Osinski).
+
 Other Changes
 ----------------------

@@ -1671,6 +1726,12 @@ New Features
 Explanation objects in it's responses instead of
 Explanation.toString (hossman)

+* SOLR-2448: Search results clustering updates: bisecting k-means
+clustering algorithm added, loading of Carrot2 stop words from
+<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
+for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
+(Stanislaw Osinski, Dawid Weiss).
+
 Optimizations
 ----------------------

@@ -2014,6 +2075,26 @@ New Features

 * SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji)

+* SOLR-1804: Re-enabled clustering component on trunk, updated to latest
+version of Carrot2. No more LGPL run-time dependencies. This release of
+C2 also does not have a specific Lucene dependency.
+(Stanislaw Osinski, gsingers)
+
+* SOLR-2282: Add distributed search support for search result clustering.
+(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
+
+* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
+
+* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
+tokenizer and filters to contrib/analysis-extras (rmuir)
+
+* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
+UAX#29, a unicode algorithm with good results for most languages, as well as
+URL and E-mail tokenization according to the relevant RFCs.
+(Tom Burton-West via rmuir)
+
+* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
+
 Optimizations
 ----------------------

@@ -2035,6 +2116,10 @@ Optimizations

 * SOLR-2046: add common functions to scripts-util. (koji)

+* SOLR-1684: Switch clustering component to use the
+SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document
+cache (gsingers)
+
 Bug Fixes
 ----------------------
 * SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
@@ -2289,6 +2374,9 @@ Bug Fixes
 * SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
 thread safe and could throw an exception. (yonik)

+* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary
+option (gsingers)
+
 Other Changes
 ----------------------

@@ -1,63 +0,0 @@
-Apache Solr - Analysis Extras
-Release Notes
-
-Introduction
-------------
-The analysis-extras plugin provides additional analyzers that rely
-upon large dependencies/dictionaries.
-
-It includes integration with ICU for multilingual support, and
-analyzers for Chinese and Polish.
-
-
-$Id$
-================== 5.0.0 ==============
-
-(No changes)
-
-================== 4.0.0-ALPHA ==============
-
-* SOLR-2396: Add ICUCollationField, which is much more efficient than
-the Solr 3.x ICUCollationKeyFilterFactory, and also supports
-Locale-sensitive range queries. (rmuir)
-
-================== 3.6.1 ==================
-
-(No Changes)
-
-================== 3.6.0 ==================
-
-* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
-These can be used to customize range query/sort behavior, for example to
-support numeric collation, ignore punctuation/whitespace, ignore accents but
-not case, control whether upper/lowercase values are sorted first, etc. (rmuir)
-
-================== 3.5.0 ==================
-
-(No Changes)
-
-================== 3.4.0 ==================
-
-(No Changes)
-
-================== 3.3.0 ==================
-
-(No Changes)
-
-================== 3.2.0 ==================
-
-(No Changes)
-
-================== 3.1.0 ==================
-
-* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
-
-* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese)
-tokenizer and filters to contrib/analysis-extras (rmuir)
-
-* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
-UAX#29, a unicode algorithm with good results for most languages, as well as
-URL and E-mail tokenization according to the relevant RFCs.
-(Tom Burton-West via rmuir)
-
-* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
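
The SOLR-2919 entry in the analysis-extras notes above lists tailoring options such as numeric collation and ignoring accents but not case. As a rough illustration of what those two behaviours mean for sort order, here is a minimal Python sketch. It uses only the standard library and does not call ICU or Solr; `numeric_key` and `strip_accents` are hypothetical helper names introduced for this example, and real ICU collation covers far more cases than this.

```python
import re
import unicodedata


def numeric_key(text):
    """Sort key that compares runs of ASCII digits by numeric value.

    Hypothetical helper: approximates the idea of numeric collation,
    so that "doc10" sorts after "doc2".
    """
    # re.split with a capturing group alternates text runs (even indexes)
    # and digit runs (odd indexes).
    parts = re.split(r"([0-9]+)", text)
    key = []
    for index, part in enumerate(parts):
        if part == "":
            continue
        if index % 2 == 1:
            key.append((0, int(part), ""))   # digit run, compared as a number
        else:
            key.append((1, 0, part))         # text run, compared as a string
    return key


def strip_accents(text):
    """Drop combining marks but keep case, so accents are ignored and case is not."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["doc10", "doc2", "doc1"]
print(sorted(names))                   # plain string order
print(sorted(names, key=numeric_key))  # numeric order
print(strip_accents("Résumé"))
```

In Solr itself these behaviours are selected through the factory's tailoring options rather than through custom code; the sketch only shows the intended effect on ordering.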
@@ -1,87 +0,0 @@
-Apache Solr Clustering Implementation
-
-Intro:
-
-See http://wiki.apache.org/solr/ClusteringComponent
-
-CHANGES
-
-$Id$
-================== Release 5.0.0 ==============
-
-(No changes)
-
-================== Release 4.0.0-ALPHA ==============
-
-* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are
-respected now (Stanislaw Osinski, Dawid Weiss)
-
-================== Release 3.6.1 ==================
-
-* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are
-respected now (Stanislaw Osinski, Dawid Weiss)
-
-================== Release 3.6.0 ==================
-
-* SOLR-2937: Configuring the number of contextual snippets used for
-search results clustering. The hl.snippets parameter is now respected
-by the clustering plugin, can be overridden by carrot.summarySnippets
-if needed (Stanislaw Osinski).
-
-* SOLR-2938: Clustering on multiple fields. The carrot.title and
-carrot.snippet can now take comma- or space-separated lists of
-field names to cluster (Stanislaw Osinski).
-
-* SOLR-2939: Clustering of multilingual search results. The document's
-language field be passed in the carrot.lang parameter, the carrot.lcmap
-parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
-
-* SOLR-2940: Passing values for custom Carrot2 fields. The custom field
-mapping are defined using the carrot.custom parameter (Stanislaw Osinski).
-
-* SOLR-2941: NullPointerException on clustering component initialization
-when schema does not have a unique key field (Stanislaw Osinski).
-
-* SOLR-2942: ClassCastException when passing non-textual fields for
-clustering (Stanislaw Osinski).
-
-================== Release 3.5.0 ==================
-
-(No Changes)
-
-================== Release 3.4.0 ==================
-
-* SOLR-2706: The carrot.lexicalResourcesDir parameter now works
-with absolute directories (Stanislaw Osinski)
-
-* SOLR-2692: Typo in param name fixed: "carrot.fragzise" changed to
-"carrot.fragSize" (Stanislaw Osinski).
-
-================== Release 3.3.0 ==================
-
-(No Changes)
-
-================== Release 3.2.0 ==================
-
-* SOLR-2448: Search results clustering updates: bisecting k-means
-clustering algorithm added, loading of Carrot2 stop words from
-<solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
-for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
-(Stanislaw Osinski, Dawid Weiss).
-
-================== Release 3.1.0 ==================
-
-* SOLR-1684: Switch to use the SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document cache (gsingers)
-
-* SOLR-1692: Fix bug relating to carrot.produceSummary option (gsingers)
-
-* SOLR-1804: Re-enabled clustering on trunk, updated to latest version of Carrot2. No more LGPL run-time dependencies.
-This release of C2 also does not have a specific Lucene dependency. (Stanislaw Osinski, gsingers)
-
-* SOLR-2282: Add distributed search support for search result clustering.
-(Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
-
-================== Release 1.4.0 ==================
-
-Solr Clustering will be released for the first time in Solr 1.4. See http://wiki.apache.org/solr/ClusteringComponent
-for details on using.
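
Two of the clustering entries above describe parameter handling: SOLR-2937 says carrot.summarySnippets can override hl.snippets, and SOLR-2938 says carrot.title and carrot.snippet accept comma- or space-separated lists of field names. The Python sketch below shows one way such handling could look. It is not the clustering component's actual code: `parse_field_list` and `summary_snippets` are hypothetical helpers, request parameters are modelled as a plain dict, and the fallback default of 1 is an assumption made for this example.

```python
import re


def parse_field_list(value):
    """Split a comma- or space-separated list of field names (hypothetical helper)."""
    return [name for name in re.split(r"[,\s]+", value) if name]


def summary_snippets(params, default=1):
    """Pick the snippet count: carrot.summarySnippets first, then hl.snippets.

    `default` is an assumed fallback, not a value taken from Solr.
    """
    for name in ("carrot.summarySnippets", "hl.snippets"):
        if name in params:
            return int(params[name])
    return default


request = {
    "carrot.title": "title, subject",
    "carrot.snippet": "body abstract",
    "hl.snippets": "3",
}
print(parse_field_list(request["carrot.title"]))    # ['title', 'subject']
print(parse_field_list(request["carrot.snippet"]))  # ['body', 'abstract']
print(summary_snippets(request))                    # 3
```

Checking the more specific parameter first keeps the override rule in one place, so a request that sets only hl.snippets still controls how many snippets are used for clustering.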
@@ -1,36 +0,0 @@
-Apache Solr Language Identifier
-Release Notes
-
-This file describes changes to the SolrTika Language Identifier (contrib/langid) module.
-See http://wiki.apache.org/solr/LanguageDetection for details
-
-
-$Id$
-
-================== Release 5.0.0 ==================
-
-(No changes)
-
-================== Release 4.0.0-ALPHA ==================
-
-(No changes)
-
-================== Release 3.6.1 ==================
-
-(No Changes)
-
-================== Release 3.6.0 ==================
-
-* SOLR-3107: When using the LangDetect implementation of langid, set the random
-seed to 0, so that the same document is detected as the same language with
-the same probability every time. (Christian Moen via rmuir)
-
-================== Release 3.5.0 ==================
-
-Initial release. See README.txt.
-
-* SOLR-1979: New contrib "langid". Adds language identification capabilities as an
-Update Processor, using Tika's LanguageIdentifier (janhoy, Tommaso Teofili, gsingers)
-
-* SOLR-2839: Add alternative implementation supporting 53 languages,
-based on http://code.google.com/p/language-detection/ (rmuir)
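
The SOLR-3107 entry above fixes the random seed at 0 so that the same document is always detected as the same language with the same probability. The Python sketch below illustrates why a fixed seed has that effect for any detector that samples its input at random. It is a toy model, not the LangDetect algorithm: `detect`, `PROFILES`, and the trigram sets are invented for this example.

```python
import random

# Toy trigram profiles, invented for this example only.
PROFILES = {
    "en": {"the", "and", "ing"},
    "de": {"der", "und", "sch"},
}


def detect(text, seed=0, samples=20):
    """Return (language, probability) from randomly sampled trigrams (toy model)."""
    trigrams = [text[i:i + 3] for i in range(len(text) - 2)]
    if not trigrams:
        raise ValueError("need at least three characters")
    # A fixed seed means the same trigrams are sampled on every call.
    rng = random.Random(seed)
    picked = [rng.choice(trigrams) for _ in range(samples)]
    # Add-one smoothing keeps every language's score above zero.
    scores = {
        lang: 1 + sum(t in grams for t in picked)
        for lang, grams in PROFILES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


document = "the thing and the other thing"
first = detect(document)
second = detect(document)
print(first == second)  # True: same seed, same language, same probability
```

With an unseeded generator the sampled trigrams, and therefore the reported probability, could differ between runs on the same document, which is the behaviour the entry describes as being removed.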