lucene/contrib/CHANGES.txt

Lucene contrib change Log
======================= Trunk (not yet released) =======================
Changes in runtime behavior
(None)
API Changes
(None)
Bug fixes
1. LUCENE-1423: InstantiatedTermEnum#skipTo(Term) throws ArrayIndexOutOfBoundsException on empty index.
(Karl Wettin)
2. LUCENE-1462: InstantiatedIndexWriter did not reset pre-analyzed TokenStreams the
same way IndexWriter does. Parts of InstantiatedIndex were not Serializable.
(Karl Wettin)
3. LUCENE-1510: InstantiatedIndexReader#norms methods throw NullPointerException on empty index.
(Karl Wettin, Robert Newson)
4. LUCENE-1514: ShingleMatrixFilter#next(Token) easily throws a StackOverflowError
due to recursive invocation. (Karl Wettin)
5. LUCENE-1548: Fix distance normalization in LevenshteinDistance to
not produce negative distances (Thomas Morton via Mike McCandless)
6. LUCENE-1490: Fix latin1 conversion of HALFWIDTH_AND_FULLWIDTH_FORMS
characters to only apply to the correct subset (Daniel Cheng via
Mike McCandless)
7. LUCENE-1576: Fix BrazilianAnalyzer to downcase tokens after
StandardTokenizer so that stop words with mixed case are filtered
out. (Rafael Cunha de Almeida, Douglas Campos via Mike McCandless)
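Illustration for item 5 (LUCENE-1548). The following is a minimal sketch of
a normalized edit-distance score that cannot go negative; it is not the
contrib code. The class and method names are hypothetical, and dividing by
the longer string's length is an assumption chosen because the edit distance
never exceeds that length. Dividing by the shorter length, by contrast, can
push the score below zero.

```java
// Hypothetical sketch, not the Lucene contrib implementation.
final class NormalizedLevenshteinSketch {

    // Classic two-row dynamic-programming Levenshtein edit distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) costs j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" costs i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows; prev now holds the latest row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1 means identical, 0 means nothing in common.
    // Assumption: normalize by the longer length, which bounds the distance.
    static float similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0f; // two empty strings are identical
        }
        return 1.0f - (float) editDistance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(similarity("kitten", "sitting"));
        System.out.println(similarity("a", "abcdef"));
    }
}
```

With this normalization a very short string compared against a long one
scores near zero rather than below it.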
New features
1. LUCENE-1470: Added TrieRangeQuery, a much faster implementation of
RangeQuery at the expense of added space (additional indexed
tokens) consumed in the index. (Uwe Schindler via Mike McCandless)
2. LUCENE-1531: Added support for BoostingTermQuery to XML query parser. (Karl Wettin)
3. LUCENE-1435: Added contrib/collation, a CollationKeyFilter
allowing you to convert tokens into CollationKeys encoded using
IndexableBinaryStringTools. This allows for faster RangeQuery when
a field needs to use a custom Collator. (Steven Rowe via Mike
McCandless)
4. LUCENE-1591: EnWikiDocMaker, LineDocMaker, WriteLineDoc can now
read/write bz2 using Apache commons compress library. This means
you can download the .bz2 export from http://wikipedia.org and
immediately index it. (Shai Erera via Mike McCandless)
5. LUCENE-1629: Add SmartChineseAnalyzer to contrib/analyzers. It
improves on CJKAnalyzer and ChineseAnalyzer by handling Chinese
sentences properly. SmartChineseAnalyzer uses a Hidden Markov
Model to tokenize Chinese words in a more intelligent way.
(Xiaoping Gao via Mike McCandless)
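Illustration for item 3 (LUCENE-1435). The sketch below uses only the JDK's
java.text.Collator, not the contrib classes, to show the idea that makes
collation keys useful for range queries: once terms are converted to
collation keys, a plain byte-wise comparison gives the same order as the
Collator. The class and helper names are hypothetical, and the encoding step
performed by IndexableBinaryStringTools is not shown.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch, not the contrib/collation implementation.
final class CollationKeyOrderSketch {

    // Raw collation key bytes for a term under the given Collator.
    static byte[] keyBytes(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int cmp = (x[i] & 0xff) - (y[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        String a = "apple";
        String b = "Banana";
        // Order according to the Collator itself.
        System.out.println(Integer.signum(collator.compare(a, b)));
        // Order according to the key bytes; the sign should match.
        System.out.println(Integer.signum(
                compareUnsigned(keyBytes(collator, a), keyBytes(collator, b))));
    }
}
```

Because the ordering is carried by the key bytes, the Collator is only
needed when the keys are produced, not for each comparison afterwards.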
Optimizations
1. LUCENE-1643: Re-use the collation key (RawCollationKey) for
better performance, in ICUCollationKeyFilter. (Robert Muir via
Mike McCandless)
Documentation
(None)
Build
(None)
Test Cases
(None)
======================= Release 2.4.0 2008-10-06 =======================
Changes in runtime behavior
(None)
API Changes
(None)
Bug fixes
1. LUCENE-1312: Added full support for InstantiatedIndexReader#getFieldNames()
and tests that assert that deleted documents behave as they should (they did).
(Jason Rutherglen, Karl Wettin)
2. LUCENE-1318: InstantiatedIndexReader.norms(String, byte[], int) did not handle
the array offset correctly. (Jason Rutherglen via Karl Wettin)
New features
1. LUCENE-1320: ShingleMatrixFilter, multidimensional shingle token filter. (Karl Wettin)
2. LUCENE-1142: Updated Snowball package, org.tartarus distribution revision 500.
Introducing Hungarian, Turkish and Romanian support, updated older stemmers
and optimized (reflectionless) SnowballFilter.
IMPORTANT NOTICE ON BACKWARDS COMPATIBILITY: an index created using version 2.3.2 (or older)
might not be compatible with these updated classes, as some algorithms have changed.
(Karl Wettin)
3. LUCENE-1016: TermVectorAccessor, transparent vector space access via stored vectors
or by resolving the inverted index. (Karl Wettin)
Documentation
(None)
Build
(None)
Test Cases
(None)