Lucene contrib change Log

======================= Trunk (not yet released) =======================

Changes in runtime behavior

1. LUCENE-1505: Local Lucene now uses org.apache.lucene.util.NumericUtils for all
   number conversion. You'll need to fully re-index any previously created indexes.
   This isn't a break in back-compatibility because Local Lucene has not yet
   been released. (Mike McCandless)

API Changes

(None)

Bug fixes

1. LUCENE-1423: InstantiatedTermEnum#skipTo(Term) throws
   ArrayIndexOutOfBoundsException on empty index. (Karl Wettin)

2. LUCENE-1462: InstantiatedIndexWriter did not reset pre-analyzed TokenStreams the
   same way IndexWriter does. Parts of InstantiatedIndex were not Serializable.
   (Karl Wettin)

3. LUCENE-1510: InstantiatedIndexReader#norms methods throw NullPointerException
   on empty index. (Karl Wettin, Robert Newson)

4. LUCENE-1514: ShingleMatrixFilter#next(Token) easily throws a StackOverflowError
   due to recursive invocation. (Karl Wettin)

5. LUCENE-1548: Fix distance normalization in LevenshteinDistance to
   not produce negative distances. (Thomas Morton via Mike McCandless)

6. LUCENE-1490: Fix latin1 conversion of HALFWIDTH_AND_FULLWIDTH_FORMS
   characters to only apply to the correct subset. (Daniel Cheng via
   Mike McCandless)

7. LUCENE-1576: Fix BrazilianAnalyzer to downcase tokens after
   StandardTokenizer so that stop words with mixed case are filtered
   out. (Rafael Cunha de Almeida, Douglas Campos via Mike McCandless)

8. LUCENE-1491: EdgeNGramTokenFilter no longer stops on tokens shorter than
   the minimum n-gram size. (Todd Teak via Otis Gospodnetic)

9. LUCENE-1752: Missing highlights when terms were repeated in separate, nested, boolean or
   disjunction queries. (Koji Sekiguchi, Mark Miller)
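
Note on item 5 (LUCENE-1548): the sketch below illustrates one way to keep a
normalized Levenshtein score from going negative. It is not the contrib
spellchecker code, and the class and method names are hypothetical. It assumes
the score is defined as 1 - distance / length; since the edit distance never
exceeds the length of the longer string, dividing by the longer length keeps
the result in [0, 1].

```java
// Illustrative sketch only; names are hypothetical and this is not Lucene code.
final class NormalizedLevenshtein {

    /** Classic dynamic-programming edit distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Score in [0, 1]; normalizing by the longer length cannot go negative. */
    static float similarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0f; // two empty strings are identical
        }
        return 1.0f - (float) editDistance(a, b) / longer;
    }
}
```

With this definition, NormalizedLevenshtein.similarity("a", "abcd") is 0.25,
whereas dividing the same distance of 3 by the shorter length would give -2.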

New features

1. LUCENE-1531: Added support for BoostingTermQuery to XML query parser. (Karl Wettin)

2. LUCENE-1435: Added contrib/collation, a CollationKeyFilter
   allowing you to convert tokens into CollationKeys encoded using
   IndexableBinaryStringTools. This allows for faster RangeQuery when
   a field needs to use a custom Collator. (Steven Rowe via Mike
   McCandless)

3. LUCENE-1591: EnWikiDocMaker, LineDocMaker, WriteLineDoc can now
   read/write bz2 using the Apache Commons Compress library. This means
   you can download the .bz2 export from http://wikipedia.org and
   immediately index it. (Shai Erera via Mike McCandless)

4. LUCENE-1629: Add SmartChineseAnalyzer to contrib/analyzers. It
   improves on CJKAnalyzer and ChineseAnalyzer by handling Chinese
   sentences properly. SmartChineseAnalyzer uses a Hidden Markov
   Model to tokenize Chinese words in a more intelligent way.
   (Xiaoping Gao via Mike McCandless)

5. LUCENE-1676: Added DelimitedPayloadTokenFilter class for automatically
   adding payloads "in-stream". (Grant Ingersoll)

6. LUCENE-1578: Added support for loading unoptimized readers via the
   constructor of InstantiatedIndex. (Karl Wettin)

7. LUCENE-1704: Allow specifying the Tidy configuration file when
   parsing HTML docs with contrib/ant. (Keith Sprochi via Mike
   McCandless)

8. LUCENE-1522: Added contrib/fast-vector-highlighter, a new alternative
   highlighter. (Koji Sekiguchi via Mike McCandless)

9. LUCENE-1740: Added "analyzer" command to Lucli, enabling changing
   the analyzer from the default StandardAnalyzer. (Bernd Fondermann
   via Mike McCandless)

10. LUCENE-1272: Add get/setBoost to MoreLikeThis. (Jonathan
    Leibiusky via Mike McCandless)
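
Note on item 5 (LUCENE-1676): the sketch below illustrates the general idea of
carrying a payload "in-stream" next to a term. It is plain Java, not the
DelimitedPayloadTokenFilter API. The '|' delimiter, the UTF-8 byte encoding,
and all class and method names are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names are hypothetical and this is not Lucene code.
final class DelimitedPayloadSketch {

    /** A term and its optional payload bytes. */
    static final class TermWithPayload {
        final String term;
        final byte[] payload; // null when the token carried no delimiter

        TermWithPayload(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /** Splits "term<delimiter>payload" at the first delimiter, if any. */
    static TermWithPayload split(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return new TermWithPayload(token, null); // plain token, no payload
        }
        String term = token.substring(0, idx);
        byte[] payload = token.substring(idx + 1).getBytes(StandardCharsets.UTF_8);
        return new TermWithPayload(term, payload);
    }
}
```

For example, DelimitedPayloadSketch.split("fox|2.0", '|') yields the term "fox"
with the bytes of "2.0" as its payload, while split("dog", '|') yields "dog"
with no payload.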

Optimizations

1. LUCENE-1643: Re-use the collation key (RawCollationKey) for
   better performance, in ICUCollationKeyFilter. (Robert Muir via
   Mike McCandless)

Documentation

(None)

Build

(None)

Test Cases

(None)

======================= Release 2.4.0 2008-10-06 =======================

Changes in runtime behavior

(None)

API Changes

(None)

Bug fixes

1. LUCENE-1312: Added full support for InstantiatedIndexReader#getFieldNames()
   and tests that assert that deleted documents behave as they should (they did).
   (Jason Rutherglen, Karl Wettin)

2. LUCENE-1318: InstantiatedIndexReader.norms(String, byte[], int) did not handle
   the array offset correctly. (Jason Rutherglen via Karl Wettin)
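
Note on item 2 (LUCENE-1318): the sketch below shows the offset handling such a
method is generally expected to perform. It assumes the contract is to write one
norm byte per document into the caller's array, starting at the given offset.
The class and method names are hypothetical, and this is not the
InstantiatedIndexReader code.

```java
// Illustrative sketch only; names are hypothetical and this is not Lucene code.
final class NormsOffsetSketch {

    /**
     * Copies every byte of source into dest, starting at dest[offset].
     * Reading always starts at source[0]; only the write position is shifted.
     */
    static void copyNorms(byte[] source, byte[] dest, int offset) {
        System.arraycopy(source, 0, dest, offset, source.length);
    }
}
```

Copying three norms {1, 2, 3} into a five-byte array at offset 2 leaves the
first two bytes untouched and fills positions 2 through 4.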

New features

1. LUCENE-1320: ShingleMatrixFilter, multidimensional shingle token filter. (Karl Wettin)

2. LUCENE-1142: Updated Snowball package, org.tartarus distribution revision 500.
   Introducing Hungarian, Turkish and Romanian support, updated older stemmers
   and optimized (reflectionless) SnowballFilter.
   IMPORTANT NOTICE ON BACKWARDS COMPATIBILITY: an index created using 2.3.2 (or older)
   might not be compatible with these updated classes as some algorithms have changed.
   (Karl Wettin)

3. LUCENE-1016: TermVectorAccessor, transparent vector space access via stored vectors
   or by resolving the inverted index. (Karl Wettin)

Documentation

(None)

Build

(None)

Test Cases

(None)