cleanups to 4.x CHANGES

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1231552 13f79535-47bb-0310-9956-ffa450edef68
Robert Muir 2012-01-14 18:24:48 +00:00
parent fc57aa6a04
commit fbd34b4390
2 changed files with 7 additions and 38 deletions

@@ -12,19 +12,6 @@ API Changes
 * LUCENE-2413: Removed the AnalyzerUtil in common/miscellaneous. (Robert Muir)
-* LUCENE-2167,LUCENE-2699,LUCENE-2763,LUCENE-2847: StandardTokenizer/Analyzer
-in common/standard/ now implement the Word Break rules from the Unicode 6.0.0
-Text Segmentation algorithm (UAX#29), covering the full range of Unicode code
-points, including values from U+FFFF to U+10FFFF
-ClassicTokenizer/Analyzer retains the old (pre-Lucene 3.1) StandardTokenizer/
-Analyzer implementation and behavior. Only the Unicode Basic Multilingual
-Plane (code points from U+0000 to U+FFFF) is covered.
-UAX29URLEmailTokenizer tokenizes URLs and E-mail addresses according to the
-relevant RFCs, in addition to implementing the UAX#29 Word Break rules.
-(Steven Rowe, Robert Muir, Uwe Schindler)
 * LUCENE-1370: Added ShingleFilter option to output unigrams if no shingles
 can be generated. (Chris Harris via Steven Rowe)
@@ -42,18 +29,13 @@ API Changes
 since they prevent reuse. Both Analyzers should be configured at instantiation.
 (Chris Male)
-* LUCENE-3305: Added SegmentingTokenizerBase, which breaks text into sentences
-with BreakIterator and allows subclasses to decompose sentences into words, or
-use the sentence boundary information for other reasons (e.g. attribute/position increment)
-(Robert Muir)
 New Features
 * LUCENE-2341: A new analyzer/ filter: Morfologik - a dictionary-driven lemmatizer
 (accurate stemmer) for Polish (includes morphosyntactic annotations).
 (Michał Dybizbański, Dawid Weiss)
-* LUCENE-2413: Consolidated Solr analysis components into common.
+* LUCENE-2413: Consolidated Lucene/Solr analysis components into common.
 New features from Solr now available to Lucene users include:
 - o.a.l.analysis.commongrams: Constructs n-grams for frequently occurring terms
 and phrases.
@@ -78,7 +60,7 @@ New Features
 - o.a.l.analysis.phonetic: Package for phonetic search, containing various
 phonetic encoders such as Double Metaphone.
-* LUCENE-2413: Consolidated all Lucene analyzers into common.
+Some existing analysis components changed packages:
 - o.a.l.analysis.KeywordAnalyzer -> o.a.l.analysis.core.KeywordAnalyzer
 - o.a.l.analysis.KeywordTokenizer -> o.a.l.analysis.core.KeywordTokenizer
 - o.a.l.analysis.LetterTokenizer -> o.a.l.analysis.core.LetterTokenizer
@@ -108,19 +90,6 @@ New Features
 - o.a.l.analysis.CharTokenizer -> o.a.l.analysis.util.CharTokenizer
 - o.a.l.util.CharacterUtils -> o.a.l.analysis.util.CharacterUtils
-* SOLR-1057: Add PathHierarchyTokenizer that represents file path hierarchies as synonyms of
-/something, /something/something, /something/something/else. (Ryan McKinley, Koji Sekiguchi)
-* LUCENE-3414: Added HunspellStemFilter which uses a provided pure Java implementation of the
-Hunspell algorithm. (Chris Male)
-* LUCENE-3305: Added Kuromoji morphological analyzer for Japanese.
-(Christian Moen, Masaru Hasegawa, Simon Willnauer, Uwe Schindler, Mike McCandless, Robert Muir)
-Build
-* LUCENE-2413: All analyzers in contrib/analyzers and contrib/icu were moved to the
+All analyzers in contrib/analyzers and contrib/icu were moved to the
 analysis module. The 'smartcn' and 'stempel' components now depend on 'common'.
-(Robert Muir)
+(Chris Male, Robert Muir)
-* LUCENE-3376: Moved ReusableAnalyzerBase into lucene core. (Chris Male)

@@ -17,11 +17,11 @@ $Id$
 the Solr 3.x ICUCollationKeyFilterFactory, and also supports
 Locale-sensitive range queries. (rmuir)
-================== 3.6.0 ==================
 * LUCENE-3305: Added Kuromoji morphological analyzer for Japanese.
 (Christian Moen, Masaru Hasegawa via Robert Muir)
+================== 3.6.0 ==================
 * SOLR-2919: Added parametric tailoring options to ICUCollationKeyFilterFactory.
 These can be used to customize range query/sort behavior, for example to
 support numeric collation, ignore punctuation/whitespace, ignore accents but