SOLR-3650: checkpoint - merged in CHANGES.txt entries from contrib/analysis-extras contrib/langid contrib/clustering

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1367377 13f79535-47bb-0310-9956-ffa450edef68
2012-07-31 00:34:01 +00:00 · 2012-07-31 00:34:01 +00:00 · 6961d9f589
parent 179a0c87bd
commit 6961d9f589
4 changed files with 88 additions and 186 deletions
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@ -554,6 +554,11 @@ New Features
 * SOLR-3542: Add WeightedFragListBuilder for FVH and set it to default fragListBuilder
  in example solrconfig.xml. (Sebastian Lutze, koji)

+* SOLR-2396: Add ICUCollationField to contrib/analysis-extras, which is much 
+  more efficient than the Solr 3.x ICUCollationKeyFilterFactory, and also 
+  supports Locale-sensitive range queries.  (rmuir)
+
+
 Optimizations
 ----------------------

@ -701,6 +706,10 @@ Bug Fixes
  the hashCode implementation of {!bbox} and {!geofilt} queries.
  (hossman)

+* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
+  are respected now (Stanislaw Osinski, Dawid Weiss)
+
+
 Other Changes
 ----------------------

@ -886,6 +895,9 @@ Bug Fixes:

 * SOLR-3477: SOLR does not start up when no cores are defined (Tomás Fernández Löbbe via tommaso)

+* SOLR-3470: contrib/clustering: custom Carrot2 tokenizer and stemmer factories
+  are respected now (Stanislaw Osinski, Dawid Weiss)
+
 ==================  3.6.0  ==================
 More information about this release, including any errata related to the 
 release notes, upgrade instructions, or other changes may be found online at:
@ -1028,6 +1040,11 @@ New Features
  exception from being thrown by the default parser if "q" is missing. (yonik)
  SOLR-435: if q is "" then it's also acceptable. (dsmiley, hoss)

+* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
+  These can be used to customize range query/sort behavior, for example to
+  support numeric collation, ignore punctuation/whitespace, ignore accents but
+  not case, control whether upper/lowercase values are sorted first, etc.  (rmuir)
+
 Optimizations
 ----------------------
 * SOLR-1931: Speedup for LukeRequestHandler and admin/schema browser. New parameter
@ -1189,6 +1206,35 @@ Bug Fixes
 * SOLR-3316: Distributed grouping failed when rows parameter was set to 0 and 
  sometimes returned a wrong hit count as matches. (Cody Young, Martijn van Groningen)

+* SOLR-3107: contrib/langid: When using the LangDetect implementation of 
+  langid, set the random seed to 0, so that the same document is detected as 
+  the same language with the same probability every time.  
+  (Christian Moen via rmuir)
+
+* SOLR-2937: Configuring the number of contextual snippets used for 
+  search results clustering. The hl.snippets parameter is now respected
+  by the clustering plugin, can be overridden by carrot.summarySnippets
+  if needed (Stanislaw Osinski).
+
+* SOLR-2938: Clustering on multiple fields. The carrot.title and 
+  carrot.snippet can now take comma- or space-separated lists of
+  field names to cluster (Stanislaw Osinski).
+
+* SOLR-2939: Clustering of multilingual search results. The document's
+  language field be passed in the carrot.lang parameter, the carrot.lcmap
+  parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
+
+* SOLR-2940: Passing values for custom Carrot2 fields to Clustering component. 
+  The custom field mapping are defined using the carrot.custom parameter 
+  (Stanislaw Osinski).
+
+* SOLR-2941: NullPointerException on clustering component initialization 
+  when schema does not have a unique key field (Stanislaw Osinski).
+
+* SOLR-2942: ClassCastException when passing non-textual fields to  
+  clustering component (Stanislaw Osinski).
+
+
 Other Changes
 ----------------------
 * SOLR-2922: Upgrade commons-io and commons-lang to 2.1 and 2.6, respectively. (koji)
@ -1294,6 +1340,9 @@ New Features
  request param that can be used to delete all but the most recent N backups.
  (James Dyer via hossman)

+* SOLR-2839: Add alternative implementation to contrib/langid supporting 53 
+  languages, based on http://code.google.com/p/language-detection/ (rmuir)
+
 Optimizations
 ----------------------

@ -1516,6 +1565,12 @@ Bug Fixes
  failed due to sort by function changes introduced in SOLR-1297
  (Mitsu Hadeishi, hossman)

+* SOLR-2706: contrib/clustering: The carrot.lexicalResourcesDir parameter 
+  now works with absolute directories (Stanislaw Osinski)
+  
+* SOLR-2692: contrib/clustering: Typo in param name fixed: "carrot.fragzise" 
+  changed to "carrot.fragSize" (Stanislaw Osinski).
+
 Other Changes
 ----------------------

@ -1671,6 +1726,12 @@ New Features
  Explanation objects in it's responses instead of
  Explanation.toString  (hossman)

+* SOLR-2448: Search results clustering updates: bisecting k-means
+  clustering algorithm added, loading of Carrot2 stop words from
+  <solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
+  for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
+  (Stanislaw Osinski, Dawid Weiss).
+
 Optimizations
 ----------------------

@ -2014,6 +2075,26 @@ New Features

 * SOLR-1057: Add PathHierarchyTokenizerFactory. (ryan, koji)

+* SOLR-1804: Re-enabled clustering component on trunk, updated to latest 
+  version of Carrot2.  No more LGPL run-time dependencies.  This release of 
+  C2 also does not have a specific Lucene dependency.  
+  (Stanislaw Osinski, gsingers)
+
+* SOLR-2282: Add distributed search support for search result clustering.
+  (Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
+
+* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
+
+* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese) 
+  tokenizer and filters to contrib/analysis-extras (rmuir)
+
+* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
+  UAX#29, a unicode algorithm with good results for most languages, as well as
+  URL and E-mail tokenization according to the relevant RFCs.
+  (Tom Burton-West via rmuir)
+
+* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
+
 Optimizations
 ----------------------

@ -2035,6 +2116,10 @@ Optimizations

 * SOLR-2046: add common functions to scripts-util. (koji)

+* SOLR-1684: Switch clustering component to use the 
+  SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document 
+  cache (gsingers)
+
 Bug Fixes
 ----------------------
 * SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
@ -2289,6 +2374,9 @@ Bug Fixes
 * SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
  thread safe and could throw an exception. (yonik)

+* SOLR-1692: Fix bug in clustering component relating to carrot.produceSummary 
+  option (gsingers)
+
 Other Changes
 ----------------------

--- a/solr/contrib/analysis-extras/CHANGES.txt
+++ b/solr/contrib/analysis-extras/CHANGES.txt
@ -1,63 +0,0 @@
-                    Apache Solr - Analysis Extras
-                            Release Notes
-
-Introduction
------------
-The analysis-extras plugin provides additional analyzers that rely
-upon large dependencies/dictionaries.
-
-It includes integration with ICU for multilingual support, and 
-analyzers for Chinese and Polish.
-
-
-$Id$
-==================  5.0.0 ==============
-
-  (No changes)
-
-==================  4.0.0-ALPHA ==============
-
-* SOLR-2396: Add ICUCollationField, which is much more efficient than
-  the Solr 3.x ICUCollationKeyFilterFactory, and also supports
-  Locale-sensitive range queries.  (rmuir)
-
-==================  3.6.1 ==================
-
-(No Changes)
-
-==================  3.6.0 ==================
-
-* SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
-  These can be used to customize range query/sort behavior, for example to
-  support numeric collation, ignore punctuation/whitespace, ignore accents but
-  not case, control whether upper/lowercase values are sorted first, etc.  (rmuir)
-
-==================  3.5.0 ==================
-
-(No Changes)
-
-==================  3.4.0 ==================
-
-(No Changes)
-
-==================  3.3.0 ==================
-
-(No Changes)
-
-==================  3.2.0 ==================
-
-(No Changes)
-
-==================  3.1.0 ==================
-
-* SOLR-2210: Add icu-based tokenizer and filters to contrib/analysis-extras (rmuir)
-
-* SOLR-1336: Add SmartChinese (word segmentation for Simplified Chinese) 
-  tokenizer and filters to contrib/analysis-extras (rmuir)
-
-* SOLR-2211,LUCENE-2763: Added UAX29URLEmailTokenizerFactory, which implements
-  UAX#29, a unicode algorithm with good results for most languages, as well as
-  URL and E-mail tokenization according to the relevant RFCs.
-  (Tom Burton-West via rmuir)
-
-* SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras (rmuir)
--- a/solr/contrib/clustering/CHANGES.txt
+++ b/solr/contrib/clustering/CHANGES.txt
@ -1,87 +0,0 @@
-Apache Solr Clustering Implementation
-
-Intro:
-
-See http://wiki.apache.org/solr/ClusteringComponent
-
-CHANGES
-
-$Id$
-================== Release 5.0.0 ==============
-
- (No changes)
-
-================== Release 4.0.0-ALPHA ==============
-
-* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are 
-  respected now (Stanislaw Osinski, Dawid Weiss)
-
-================== Release 3.6.1 ==================
-
-* SOLR-3470: Bug fix: custom Carrot2 tokenizer and stemmer factories are 
-  respected now (Stanislaw Osinski, Dawid Weiss)
-
-================== Release 3.6.0 ==================
-
-* SOLR-2937: Configuring the number of contextual snippets used for 
-  search results clustering. The hl.snippets parameter is now respected
-  by the clustering plugin, can be overridden by carrot.summarySnippets
-  if needed (Stanislaw Osinski).
-
-* SOLR-2938: Clustering on multiple fields. The carrot.title and 
-  carrot.snippet can now take comma- or space-separated lists of
-  field names to cluster (Stanislaw Osinski).
-
-* SOLR-2939: Clustering of multilingual search results. The document's
-  language field be passed in the carrot.lang parameter, the carrot.lcmap
-  parameter enables mapping of language codes to ISO 639 (Stanislaw Osinski).
-
-* SOLR-2940: Passing values for custom Carrot2 fields. The custom field
-  mapping are defined using the carrot.custom parameter (Stanislaw Osinski).
-
-* SOLR-2941: NullPointerException on clustering component initialization 
-  when schema does not have a unique key field (Stanislaw Osinski).
-
-* SOLR-2942: ClassCastException when passing non-textual fields for 
-  clustering (Stanislaw Osinski).
-
-================== Release 3.5.0 ==================
-
-(No Changes)
-
-================== Release 3.4.0 ==================
-
-* SOLR-2706: The carrot.lexicalResourcesDir parameter now works 
-  with absolute directories (Stanislaw Osinski)
-  
-* SOLR-2692: Typo in param name fixed: "carrot.fragzise" changed to 
-  "carrot.fragSize" (Stanislaw Osinski).
-
-================== Release 3.3.0 ==================
-
-(No Changes)
-
-================== Release 3.2.0 ==================
-
-* SOLR-2448: Search results clustering updates: bisecting k-means
-  clustering algorithm added, loading of Carrot2 stop words from
-  <solr.home>/conf/carrot2 (SOLR-2449), using Solr's stopwords.txt
-  for clustering (SOLR-2450), output of cluster scores (SOLR-2505)
-  (Stanislaw Osinski, Dawid Weiss).
-
-================== Release 3.1.0 ==================
-
-* SOLR-1684: Switch to use the SolrIndexSearcher.doc(int, Set<String>) method b/c it can use the document cache (gsingers)
-
-* SOLR-1692: Fix bug relating to carrot.produceSummary option (gsingers)
-
-* SOLR-1804: Re-enabled clustering on trunk, updated to latest version of Carrot2.  No more LGPL run-time dependencies.
-  This release of C2 also does not have a specific Lucene dependency.  (Stanislaw Osinski, gsingers)
-
-* SOLR-2282: Add distributed search support for search result clustering.
-  (Brad Giaccio, Dawid Weiss, Stanislaw Osinski, rmuir, koji)
-
-================== Release 1.4.0 ==================
-
-Solr Clustering will be released for the first time in Solr 1.4.  See http://wiki.apache.org/solr/ClusteringComponent
- for details on using.
--- a/solr/contrib/langid/CHANGES.txt
+++ b/solr/contrib/langid/CHANGES.txt
@ -1,36 +0,0 @@
-Apache Solr Language Identifier
-                            Release Notes
-
-This file describes changes to the SolrTika Language Identifier (contrib/langid) module.
-See http://wiki.apache.org/solr/LanguageDetection for details
-
-
-$Id$
-
-================== Release 5.0.0 ==================
-
-(No changes)
-
-================== Release 4.0.0-ALPHA ==================
-
-(No changes)
-
-================== Release 3.6.1 ==================
-
-(No Changes)
-
-================== Release 3.6.0 ==================
-
-* SOLR-3107: When using the LangDetect implementation of langid, set the random
-  seed to 0, so that the same document is detected as the same language with
-  the same probability every time.  (Christian Moen via rmuir)
-
-================== Release 3.5.0 ==================
-
-Initial release.  See README.txt.
-
-* SOLR-1979: New contrib "langid". Adds language identification capabilities as an 
-  Update Processor, using Tika's LanguageIdentifier (janhoy, Tommaso Teofili, gsingers)
-
-* SOLR-2839: Add alternative implementation supporting 53 languages, 
-  based on http://code.google.com/p/language-detection/ (rmuir)