lucene/contrib/CHANGES.txt

Lucene contrib change log

======================= Trunk (not yet released) =======================

Changes in runtime behavior

 1. LUCENE-1505: Local Lucene now uses org.apache.lucene.util.NumericUtils for all
    number conversion.  You'll need to fully re-index any previously created indexes.
    This isn't a break in back-compatibility because local Lucene has not yet
    been released.  (Mike McCandless)

API Changes

 (None)

Bug fixes

 1. LUCENE-1423: InstantiatedTermEnum#skipTo(Term) threw
    ArrayIndexOutOfBoundsException on an empty index.  (Karl Wettin)

 2. LUCENE-1462: InstantiatedIndexWriter did not reset pre-analyzed TokenStreams the
    same way IndexWriter does. Parts of InstantiatedIndex were not Serializable.
    (Karl Wettin)

 3. LUCENE-1510: InstantiatedIndexReader#norms methods threw
    NullPointerException on an empty index.  (Karl Wettin, Robert Newson)

 4. LUCENE-1514: ShingleMatrixFilter#next(Token) easily threw a StackOverflowError
    due to recursive invocation. (Karl Wettin)

 5. LUCENE-1548: Fix distance normalization in LevenshteinDistance to
    not produce negative distances (Thomas Morton via Mike McCandless)

 6. LUCENE-1490: Fix latin1 conversion of HALFWIDTH_AND_FULLWIDTH_FORMS
    characters to only apply to the correct subset (Daniel Cheng via
    Mike McCandless)

 7. LUCENE-1576: Fix BrazilianAnalyzer to downcase tokens after
    StandardTokenizer so that stop words with mixed case are filtered
    out.  (Rafael Cunha de Almeida, Douglas Campos via Mike McCandless)

 8. LUCENE-1491: EdgeNGramTokenFilter no longer stops on tokens shorter than minimum n-gram size.
    (Todd Teak via Otis Gospodnetic)

 9. LUCENE-1752: Fix missing highlights when terms were repeated in separate,
    nested, boolean or disjunction queries. (Koji Sekiguchi, Mark Miller)

New features

 1. LUCENE-1531: Added support for BoostingTermQuery to XML query parser. (Karl Wettin)

 2. LUCENE-1435: Added contrib/collation, a CollationKeyFilter
    allowing you to convert tokens into CollationKeys encoded using
    IndexableBinaryStringTools.  This allows for faster RangeQuery when
    a field needs to use a custom Collator.  (Steven Rowe via Mike
    McCandless)

 3. LUCENE-1591: EnWikiDocMaker, LineDocMaker, WriteLineDoc can now
    read/write bz2 using the Apache Commons Compress library.  This means
    you can download the .bz2 export from http://wikipedia.org and
    immediately index it.  (Shai Erera via Mike McCandless)

 4. LUCENE-1629: Add SmartChineseAnalyzer to contrib/analyzers.  It
    improves on CJKAnalyzer and ChineseAnalyzer by handling Chinese
    sentences properly.  SmartChineseAnalyzer uses a Hidden Markov
    Model to tokenize Chinese words in a more intelligent way.
    (Xiaoping Gao via Mike McCandless)

 5. LUCENE-1676: Added DelimitedPayloadTokenFilter class for
    automatically adding payloads "in-stream".  (Grant Ingersoll)

 6. LUCENE-1578: Support passing unoptimized readers to the
    constructor of InstantiatedIndex. (Karl Wettin)

 7. LUCENE-1704: Allow specifying the Tidy configuration file when
    parsing HTML docs with contrib/ant.  (Keith Sprochi via Mike
    McCandless)

 8. LUCENE-1522: Added contrib/fast-vector-highlighter, a new alternative
    highlighter.  (Koji Sekiguchi via Mike McCandless)

 9. LUCENE-1740: Added "analyzer" command to Lucli, enabling changing
    the analyzer from the default StandardAnalyzer.  (Bernd Fondermann
    via Mike McCandless)

10. LUCENE-1272: Add get/setBoost to MoreLikeThis. (Jonathan
    Leibiusky via Mike McCandless)

Optimizations

 1. LUCENE-1643: Re-use the collation key (RawCollationKey) for
    better performance, in ICUCollationKeyFilter.  (Robert Muir via
    Mike McCandless)

Documentation

 (None)

Build

 (None)

Test Cases

 (None)

======================= Release 2.4.0 2008-10-06 =======================

Changes in runtime behavior

 (None)

API Changes

 (None)

Bug fixes

 1. LUCENE-1312: Added full support for InstantiatedIndexReader#getFieldNames()
    and tests that assert that deleted documents behave as they should (they did).
    (Jason Rutherglen, Karl Wettin)

 2. LUCENE-1318: InstantiatedIndexReader.norms(String, byte[], int) did not
    handle the array offset correctly. (Jason Rutherglen via Karl Wettin)

New features

 1. LUCENE-1320: ShingleMatrixFilter, multidimensional shingle token filter. (Karl Wettin)

 2. LUCENE-1142: Updated Snowball package, org.tartarus distribution revision 500.
    Introducing Hungarian, Turkish and Romanian support, updated older stemmers
    and optimized (reflectionless) SnowballFilter.
    IMPORTANT NOTICE ON BACKWARDS COMPATIBILITY: an index created using version 2.3.2 (or older)
    might not be compatible with these updated classes as some algorithms have changed.
    (Karl Wettin)

 3. LUCENE-1016: TermVectorAccessor, transparent vector space access via stored vectors
    or by resolving the inverted index. (Karl Wettin)

Documentation

 (None)

Build

 (None)

Test Cases

 (None)