mirror of https://github.com/apache/lucene.git
sync CHANGEs for 3.1
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087056 13f79535-47bb-0310-9956-ffa450edef68
parent a4c7a88834
commit 9fdc41f0f8
@@ -393,7 +393,7 @@ Optimizations
 * LUCENE-2990: ArrayUtil/CollectionUtil.*Sort() methods now exit early
 on empty or one-element lists/arrays. (Uwe Schindler)
 
-======================= Lucene 3.1 (not yet released) =======================
+======================= Lucene 3.1.0 =======================
 
 Changes in backwards compatibility policy
 
@@ -409,7 +409,7 @@ Changes in backwards compatibility policy
 
 * LUCENE-2190: Removed deprecated customScore() and customExplain()
 methods from experimental CustomScoreQuery. (Uwe Schindler)
 
 * LUCENE-2286: Enabled DefaultSimilarity.setDiscountOverlaps by default.
 This means that terms with a position increment gap of zero do not
 affect the norms calculation by default. (Robert Muir)
@@ -447,10 +447,10 @@ Changes in backwards compatibility policy
 actual file's length if the file exists, and throws FileNotFoundException
 otherwise. Returning length=0 for a non-existent file is no longer allowed. If
 you relied on that, make sure to catch the exception. (Shai Erera)
 
 * LUCENE-2386: IndexWriter no longer performs an empty commit upon new index
 creation. Previously, if you passed an empty Directory and set OpenMode to
 CREATE*, IndexWriter would make a first empty commit. If you need that
 behavior you can call writer.commit()/close() immediately after you create it.
 (Shai Erera, Mike McCandless)
 
@@ -466,10 +466,10 @@ Changes in backwards compatibility policy
 values in multi-valued field has been changed for some cases in index.
 If you index empty fields and uses positions/offsets information on that
 fields, reindex is recommended. (David Smiley, Koji Sekiguchi)
 
 * LUCENE-2804: Directory.setLockFactory new declares throwing an IOException.
 (Shai Erera, Robert Muir)
 
 * LUCENE-2837: Added deprecations noting that in 4.0, Searcher and
 Searchable are collapsed into IndexSearcher; contrib/remote and
 MultiSearcher have been removed. (Mike McCandless)
@@ -496,7 +496,7 @@ Changes in runtime behavior
 * LUCENE-2179: CharArraySet.clear() is now functional.
 (Robert Muir, Uwe Schindler)
 
 * LUCENE-2455: IndexWriter.addIndexes no longer optimizes the target index
 before it adds the new ones. Also, the existing segments are not merged and so
 the index will not end up with a single segment (unless it was empty before).
 In addition, addIndexesNoOptimize was renamed to addIndexes and no longer
@@ -515,9 +515,9 @@ Changes in runtime behavior
 usage, allowing applications to accidentally open two writers on the
 same directory. (Mike McCandless)
 
 * LUCENE-2701: maxMergeMBForOptimize and maxMergeDocs constraints set on
 LogMergePolicy now affect optimize() as well (as opposed to only regular
 merges). This means that you can run optimize() and too large segments won't
 be merged. (Shai Erera)
 
 * LUCENE-2753: IndexReader and DirectoryReader .listCommits() now return a List,
@@ -527,9 +527,9 @@ Changes in runtime behavior
 the IndexSearcher search methods that take an int nDocs will now
 throw IllegalArgumentException if nDocs is 0. Instead, you should
 use the newly added TotalHitCountCollector. (Mike McCandless)
 
 * LUCENE-2790: LogMergePolicy.useCompoundFile's logic now factors in noCFSRatio
 to determine whether the passed in segment should be compound.
 (Shai Erera, Earwin Burrfoot)
 
 * LUCENE-2805: IndexWriter now increments the index version on every change to
@@ -549,7 +549,7 @@ Changes in runtime behavior
 
 * LUCENE-2010: Segments with 100% deleted documents are now removed on
 IndexReader or IndexWriter commit. (Uwe Schindler, Mike McCandless)
 
 * LUCENE-2960: Allow some changes to IndexWriterConfig to take effect
 "live" (after an IW is instantiated), via
 IndexWriter.getConfig().setXXX(...) (Shay Banon, Mike McCandless)
@@ -567,7 +567,7 @@ API Changes
 
 * LUCENE-2103: NoLockFactory should have a private constructor;
 until Lucene 4.0 the default one will be deprecated.
 (Shai Erera via Uwe Schindler)
 
 * LUCENE-2177: Deprecate the Field ctors that take byte[] and Store.
 Since the removal of compressed fields, Store can only be YES, so
@@ -587,30 +587,30 @@ API Changes
 files are no longer open by IndexReaders. (luocanrao via Mike
 McCandless)
 
 * LUCENE-2282: IndexFileNames is exposed as a public class allowing for easier
 use by external code. In addition it offers a matchExtension method which
 callers can use to query whether a certain file matches a certain extension.
 (Shai Erera via Mike McCandless)
 
 * LUCENE-124: Add a TopTermsBoostOnlyBooleanQueryRewrite to MultiTermQuery.
 This rewrite method is similar to TopTermsScoringBooleanQueryRewrite, but
 only scores terms by their boost values. For example, this can be used
 with FuzzyQuery to ensure that exact matches are always scored higher,
 because only the boost will be used in scoring. (Robert Muir)
 
 * LUCENE-2015: Add a static method foldToASCII to ASCIIFoldingFilter to
 expose its folding logic. (Cédrik Lime via Robert Muir)
 
 * LUCENE-2294: IndexWriter constructors have been deprecated in favor of a
 single ctor which accepts IndexWriterConfig and a Directory. You can set all
 the parameters related to IndexWriter on IndexWriterConfig. The different
 setter/getter methods were deprecated as well. One should call
 writer.getConfig().getXYZ() to query for a parameter XYZ.
 Additionally, the setter/getter related to MergePolicy were deprecated as
 well. One should interact with the MergePolicy directly.
 (Shai Erera via Mike McCandless)
 
 * LUCENE-2320: IndexWriter's MergePolicy configuration was moved to
 IndexWriterConfig and the respective methods on IndexWriter were deprecated.
 (Shai Erera via Mike McCandless)
 
@@ -634,14 +634,14 @@ API Changes
 * LUCENE-2402: IndexWriter.deleteUnusedFiles now deletes unreferenced commit
 points too. If you use an IndexDeletionPolicy which holds onto index commits
 (such as SnapshotDeletionPolicy), you can call this method to remove those
 commit points when they are not needed anymore (instead of waiting for the
 next commit). (Shai Erera)
 
 * LUCENE-2481: SnapshotDeletionPolicy.snapshot() and release() were replaced
 with equivalent ones that take a String (id) as argument. You can pass
 whatever ID you want, as long as you use the same one when calling both.
 (Shai Erera)
 
 * LUCENE-2356: Add IndexWriterConfig.set/getReaderTermIndexDivisor, to
 set what IndexWriter passes for termsIndexDivisor to the readers it
 opens internally when apply deletions or creating a near-real-time
@@ -651,7 +651,7 @@ API Changes
 in common/standard/ now implement the Word Break rules from the Unicode 6.0.0
 Text Segmentation algorithm (UAX#29), covering the full range of Unicode code
 points, including values from U+FFFF to U+10FFFF
 
 ClassicTokenizer/Analyzer retains the old (pre-Lucene 3.1) StandardTokenizer/
 Analyzer implementation and behavior. Only the Unicode Basic Multilingual
 Plane (code points from U+0000 to U+FFFF) is covered.
@@ -659,16 +659,16 @@ API Changes
 UAX29URLEmailTokenizer tokenizes URLs and E-mail addresses according to the
 relevant RFCs, in addition to implementing the UAX#29 Word Break rules.
 (Steven Rowe, Robert Muir, Uwe Schindler)
 
 * LUCENE-2778: RAMDirectory now exposes newRAMFile() which allows to override
 and return a different RAMFile implementation. (Shai Erera)
 
 * LUCENE-2785: Added TotalHitCountCollector whose sole purpose is to
 count the number of hits matching the query. (Mike McCandless)
 
 * LUCENE-2846: Deprecated IndexReader.setNorm(int, String, float). This method
 is only syntactic sugar for setNorm(int, String, byte), but using the global
 Similarity.getDefault().encodeNormValue(). Use the byte-based method instead
 to ensure that the norm is encoded with your Similarity.
 (Robert Muir, Mike McCandless)
 
@@ -689,6 +689,9 @@ API Changes
 for AttributeImpls, but can still be provided (if needed).
 (Uwe Schindler)
 
+* LUCENE-2691: Deprecate IndexWriter.getReader in favor of
+IndexReader.open(IndexWriter) (Grant Ingersoll, Mike McCandless)
+
 * LUCENE-2876: Deprecated Scorer.getSimilarity(). If your Scorer uses a Similarity,
 it should keep it itself. Fixed Scorers to pass their parent Weight, so that
 Scorer.visitSubScorers (LUCENE-2590) will work correctly.
@@ -700,7 +703,7 @@ API Changes
 expert use cases can handle seeing deleted documents returned. The
 deletes remain buffered so that the next time you open an NRT reader
 and pass true, all deletes will be a applied. (Mike McCandless)
 
 * LUCENE-1253: LengthFilter (and Solr's KeepWordTokenFilter) now
 require up front specification of enablePositionIncrement. Together with
 StopFilter they have a common base class (FilteringTokenFilter) that handles
@@ -711,7 +714,7 @@ Bug fixes
 
 * LUCENE-2249: ParallelMultiSearcher should shut down thread pool on
 close. (Martin Traverso via Uwe Schindler)
 
 * LUCENE-2273: FieldCacheImpl.getCacheEntries() used WeakHashMap
 incorrectly and lead to ConcurrentModificationException.
 (Uwe Schindler, Robert Muir)
@@ -722,7 +725,7 @@ Bug fixes
 
 * LUCENE-2074: Reduce buffer size of lexer back to default on reset.
 (Ruben Laguna, Shai Erera via Uwe Schindler)
 
 * LUCENE-2496: Don't throw NPE if IndexWriter is opened with CREATE on
 a prior (corrupt) index missing its segments_N file. (Mike
 McCandless)
@@ -731,10 +734,10 @@ Bug fixes
 assuming whitespace tokenization. Previously all CJK queries, for example,
 would be turned into phrase queries. The old behavior is preserved with
 the matchVersion parameter for previous versions. Additionally, you can
 explicitly enable the old behavior with setAutoGeneratePhraseQueries(true)
 (Robert Muir)
 
 * LUCENE-2537: FSDirectory.copy() implementation was unsafe and could result in
 OOM if a large file was copied. (Shai Erera)
 
 * LUCENE-2580: MultiPhraseQuery throws AIOOBE if number of positions
@@ -752,14 +755,14 @@ Bug fixes
 
 * LUCENE-2802: NRT DirectoryReader returned incorrect values from
 getVersion, isOptimized, getCommitUserData, getIndexCommit and isCurrent due
 to a mutable reference to the IndexWriters SegmentInfos.
 (Simon Willnauer, Earwin Burrfoot)
 
 * LUCENE-2852: Fixed corner case in RAMInputStream that would hit a
 false EOF after seeking to EOF then seeking back to same block you
 were just in and then calling readBytes (Robert Muir, Mike McCandless)
 
 * LUCENE-2860: Fixed SegmentInfo.sizeInBytes to factor includeDocStores when it
 decides whether to return the cached computed size or not. (Shai Erera)
 
 * LUCENE-2584: SegmentInfo.files() could hit ConcurrentModificationException if
@@ -772,7 +775,7 @@ Bug fixes
 internally, it now calls Similarity.idfExplain(Collection, IndexSearcher).
 (Robert Muir)
 
 * LUCENE-2693: RAM used by IndexWriter was slightly incorrectly computed.
 (Jason Rutherglen via Shai Erera)
 
 * LUCENE-1846: DateTools now uses the US locale everywhere, so DateTools.round()
@@ -788,6 +791,9 @@ Bug fixes
 been rounded down to 0 instead of being rounded up to the smallest
 positive number. (yonik)
 
+* LUCENE-2936: PhraseQuery score explanations were not correctly
+identifying matches vs non-matches. (hossman)
+
 * LUCENE-2975: A hotspot bug corrupts IndexInput#readVInt()/readVLong() if
 the underlying readByte() is inlined (which happens e.g. in MMapDirectory).
 The loop was unwinded which makes the hotspot bug disappear.
@@ -796,30 +802,30 @@ Bug fixes
 New features
 
 * LUCENE-2128: Parallelized fetching document frequencies during weight
 creation. (Israel Tsadok, Simon Willnauer via Uwe Schindler)
 
 * LUCENE-2069: Added Unicode 4 support to CharArraySet. Due to the switch
 to Java 5, supplementary characters are now lowercased correctly if the
 set is created as case insensitive.
 CharArraySet now requires a Version argument to preserve
 backwards compatibility. If Version < 3.1 is passed to the constructor,
 CharArraySet yields the old behavior. (Simon Willnauer)
 
 * LUCENE-2069: Added Unicode 4 support to LowerCaseFilter. Due to the switch
 to Java 5, supplementary characters are now lowercased correctly.
 LowerCaseFilter now requires a Version argument to preserve
 backwards compatibility. If Version < 3.1 is passed to the constructor,
 LowerCaseFilter yields the old behavior. (Simon Willnauer, Robert Muir)
 
 * LUCENE-2034: Added ReusableAnalyzerBase, an abstract subclass of Analyzer
 that makes it easier to reuse TokenStreams correctly. This issue also added
 StopwordAnalyzerBase, which improves consistency of all Analyzers that use
 stopwords, and implement many analyzers in contrib with it.
 (Simon Willnauer via Robert Muir)
 
 * LUCENE-2198, LUCENE-2901: Support protected words in stemming TokenFilters using a
 new KeywordAttribute. (Simon Willnauer, Drew Farris via Uwe Schindler)
 
 * LUCENE-2183, LUCENE-2240, LUCENE-2241: Added Unicode 4 support
 to CharTokenizer and its subclasses. CharTokenizer now has new
 int-API which is conditionally preferred to the old char-API depending
@@ -828,8 +834,8 @@ New features
 
 * LUCENE-2247: Added a CharArrayMap<V> for performance improvements
 in some stemmers and synonym filters. (Uwe Schindler)
 
 * LUCENE-2320: Added SetOnce which wraps an object and allows it to be set
 exactly once. (Shai Erera via Mike McCandless)
 
 * LUCENE-2314: Added AttributeSource.copyTo(AttributeSource) that
@@ -856,19 +862,19 @@ New features
 Directory.copyTo, and use nio's FileChannel.transferTo when copying
 files between FSDirectory instances. (Earwin Burrfoot via Mike
 McCandless).
 
 * LUCENE-2074: Make StandardTokenizer fit for Unicode 4.0, if the
 matchVersion parameter is Version.LUCENE_31. (Uwe Schindler)
 
 * LUCENE-2385: Moved NoDeletionPolicy from benchmark to core. NoDeletionPolicy
 can be used to prevent commits from ever getting deleted from the index.
 (Shai Erera)
 
 * LUCENE-1585: IndexWriter now accepts a PayloadProcessorProvider which can
 return a DirPayloadProcessor for a given Directory, which returns a
 PayloadProcessor for a given Term. The PayloadProcessor will be used to
 process the payloads of the segments as they are merged (e.g. if one wants to
 rewrite payloads of external indexes as they are added, or of local ones).
 (Shai Erera, Michael Busch, Mike McCandless)
 
 * LUCENE-2440: Add support for custom ExecutorService in
@@ -881,7 +887,7 @@ New features
 
 * LUCENE-2526: Don't throw NPE from MultiPhraseQuery.toString when
 it's empty. (Ross Woolf via Mike McCandless)
 
 * LUCENE-2559: Added SegmentReader.reopen methods (John Wang via Mike
 McCandless)
 
@@ -897,17 +903,20 @@ New features
 to add span support: SpanMultiTermQueryWrapper<Q extends MultiTermQuery>.
 Using this wrapper its easy to add fuzzy/wildcard to e.g. a SpanNearQuery.
 (Robert Muir, Uwe Schindler)
 
 * LUCENE-2838: ConstantScoreQuery now directly supports wrapping a Query
 instance for stripping off scores. The use of a QueryWrapperFilter
 is no longer needed and discouraged for that use case. Directly wrapping
 Query improves performance, as out-of-order collection is now supported.
 (Uwe Schindler)
 
 * LUCENE-2864: Add getMaxTermFrequency (maximum within-document TF) to
 FieldInvertState so that it can be used in Similarity.computeNorm.
 (Robert Muir)
 
+* LUCENE-2720: Segments now record the code version which created them.
+(Shai Erera, Mike McCandless, Uwe Schindler)
+
 * LUCENE-2474: Added expert ReaderFinishedListener API to
 IndexReader, to allow apps that maintain external per-segment caches
 to evict entries when a segment is finished. (Shay Banon, Yonik
|
@ -916,8 +925,8 @@ New features
|
||||||
* LUCENE-2911: The new StandardTokenizer, UAX29URLEmailTokenizer, and
|
* LUCENE-2911: The new StandardTokenizer, UAX29URLEmailTokenizer, and
|
||||||
the ICUTokenizer in contrib now all tag types with a consistent set
|
the ICUTokenizer in contrib now all tag types with a consistent set
|
||||||
of token types (defined in StandardTokenizer). Tokens in the major
|
of token types (defined in StandardTokenizer). Tokens in the major
|
||||||
CJK types are explicitly marked to allow for custom downstream handling:
|
CJK types are explicitly marked to allow for custom downstream handling:
|
||||||
<IDEOGRAPHIC>, <HANGUL>, <KATAKANA>, and <HIRAGANA>.
|
<IDEOGRAPHIC>, <HANGUL>, <KATAKANA>, and <HIRAGANA>.
|
||||||
(Robert Muir, Steven Rowe)
|
(Robert Muir, Steven Rowe)
|
||||||
|
|
||||||
* LUCENE-2913: Add missing getters to Numeric* classes. (Uwe Schindler)
|
* LUCENE-2913: Add missing getters to Numeric* classes. (Uwe Schindler)
|
||||||
|
@ -942,7 +951,7 @@ Optimizations
|
||||||
* LUCENE-2137: Switch to AtomicInteger for some ref counting (Earwin
|
* LUCENE-2137: Switch to AtomicInteger for some ref counting (Earwin
|
||||||
Burrfoot via Mike McCandless)
|
Burrfoot via Mike McCandless)
|
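The switch to AtomicInteger for reference counting can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not Lucene's actual code; the sketch only shows how an atomic counter replaces a synchronized block for incRef/decRef style bookkeeping.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical holder, not a Lucene class: counts references without a lock. */
final class RefCountedResource {
    // Starts at 1 because the creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Takes one more reference; fails if the resource was already released. */
    void incRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("resource already released");
            }
            // Retry if another thread changed the count in the meantime.
            if (refCount.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    /** Drops one reference and releases the resource when the count reaches zero. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // stand-in for closing files, buffers, and so on
        } else if (remaining < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
    }

    int getRefCount() {
        return refCount.get();
    }

    boolean isClosed() {
        return closed;
    }
}
```

The compare-and-set loop in incRef is one possible design; it prevents a released resource from being revived by a late caller.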
||||||
|
|
||||||
* LUCENE-2123, LUCENE-2261: Move FuzzyQuery rewrite to separate RewriteMode
|
* LUCENE-2123, LUCENE-2261: Move FuzzyQuery rewrite to separate RewriteMode
|
||||||
into MultiTermQuery. The number of fuzzy expansions can be specified with
|
into MultiTermQuery. The number of fuzzy expansions can be specified with
|
||||||
the maxExpansions parameter to FuzzyQuery.
|
the maxExpansions parameter to FuzzyQuery.
|
||||||
(Uwe Schindler, Robert Muir, Mike McCandless)
|
(Uwe Schindler, Robert Muir, Mike McCandless)
|
||||||
|
@ -976,12 +985,12 @@ Optimizations
|
||||||
TermAttributeImpl, move DEFAULT_TYPE constant to TypeInterface, improve
|
TermAttributeImpl, move DEFAULT_TYPE constant to TypeInterface, improve
|
||||||
null-handling for TypeAttribute. (Uwe Schindler)
|
null-handling for TypeAttribute. (Uwe Schindler)
|
||||||
|
|
||||||
* LUCENE-2329: Switch TermsHash* from using a PostingList object per unique
|
* LUCENE-2329: Switch TermsHash* from using a PostingList object per unique
|
||||||
term to parallel arrays, indexed by termID. This reduces garbage collection
|
term to parallel arrays, indexed by termID. This reduces garbage collection
|
||||||
overhead significantly, which results in great indexing performance wins
|
overhead significantly, which results in great indexing performance wins
|
||||||
when the available JVM heap space is low. This will become even more
|
when the available JVM heap space is low. This will become even more
|
||||||
important when the DocumentsWriter RAM buffer is searchable in the future,
|
important when the DocumentsWriter RAM buffer is searchable in the future,
|
||||||
because then it will make sense to make the RAM buffers as large as
|
because then it will make sense to make the RAM buffers as large as
|
||||||
possible. (Mike McCandless, Michael Busch)
|
possible. (Mike McCandless, Michael Busch)
|
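The idea of replacing one object per unique term with parallel arrays indexed by termID can be sketched as follows. This is an illustration under assumptions, not the TermsHash* implementation: the class name, the two tracked values, and the use of a HashMap to assign term IDs are all hypothetical (the map itself still allocates objects; it is only there to keep the sketch short).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-term state kept in parallel int arrays, indexed by termID. */
final class ParallelPostings {
    private final Map<String, Integer> termIds = new HashMap<>();
    // One slot per term in each array, instead of one posting object per term.
    private int[] freqs = new int[4];
    private int[] lastDocIds = new int[4];

    /** Records one occurrence of term in docId and returns the term's ID. */
    int add(String term, int docId) {
        Integer id = termIds.get(term);
        if (id == null) {
            id = termIds.size();
            termIds.put(term, id);
            if (id == freqs.length) {
                // Grow both arrays together so they stay parallel.
                freqs = Arrays.copyOf(freqs, freqs.length * 2);
                lastDocIds = Arrays.copyOf(lastDocIds, lastDocIds.length * 2);
            }
        }
        freqs[id]++;
        lastDocIds[id] = docId;
        return id;
    }

    /** Total number of occurrences recorded for the term. */
    int freq(int termId) {
        return freqs[termId];
    }

    /** Document in which the term was last seen. */
    int lastDocId(int termId) {
        return lastDocIds[termId];
    }
}
```

With this layout, adding a term touches a few array slots rather than allocating a new object, which is the source of the reduced garbage collection pressure the entry describes.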
||||||
|
|
||||||
* LUCENE-2380: The terms field cache methods (getTerms,
|
* LUCENE-2380: The terms field cache methods (getTerms,
|
||||||
|
@ -996,7 +1005,7 @@ Optimizations
|
||||||
causing too many fallbacks to compare-by-value (instead of by-ord).
|
causing too many fallbacks to compare-by-value (instead of by-ord).
|
||||||
(Mike McCandless)
|
(Mike McCandless)
|
||||||
|
|
||||||
* LUCENE-2574: IndexInput exposes copyBytes(IndexOutput, long) to allow for
|
* LUCENE-2574: IndexInput exposes copyBytes(IndexOutput, long) to allow for
|
||||||
efficient copying by sub-classes. Optimized copy is implemented for RAM and FS
|
efficient copying by sub-classes. Optimized copy is implemented for RAM and FS
|
||||||
streams. (Shai Erera)
|
streams. (Shai Erera)
|
||||||
|
|
||||||
|
@ -1019,15 +1028,15 @@ Optimizations
|
||||||
|
|
||||||
* LUCENE-2010: Segments with 100% deleted documents are now removed on
|
* LUCENE-2010: Segments with 100% deleted documents are now removed on
|
||||||
IndexReader or IndexWriter commit. (Uwe Schindler, Mike McCandless)
|
IndexReader or IndexWriter commit. (Uwe Schindler, Mike McCandless)
|
||||||
|
|
||||||
* LUCENE-1472: Removed synchronization from static DateTools methods
|
* LUCENE-1472: Removed synchronization from static DateTools methods
|
||||||
by using a ThreadLocal. Also converted DateTools.Resolution to a
|
by using a ThreadLocal. Also converted DateTools.Resolution to a
|
||||||
Java 5 enum (this should not break backwards). (Uwe Schindler)
|
Java 5 enum (this should not break backwards). (Uwe Schindler)
|
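The ThreadLocal technique mentioned for DateTools can be sketched with plain JDK classes. The class name, the "yyyyMMdd" pattern, and the UTC time zone below are assumptions chosen for illustration; this is not the DateTools source.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper: each thread gets its own SimpleDateFormat, so no lock is needed. */
final class ThreadLocalDateFormatter {
    // SimpleDateFormat is not thread-safe; a per-thread instance avoids sharing it.
    private static final ThreadLocal<SimpleDateFormat> DAY_FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        });

    /** Formats the date at day resolution using the calling thread's formatter. */
    static String formatDay(Date date) {
        return DAY_FORMAT.get().format(date);
    }
}
```

The static method needs no synchronized keyword because no formatter instance is ever visible to more than one thread.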
||||||
|
|
||||||
Build
|
Build
|
||||||
|
|
||||||
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
||||||
into core, and moved the ICU-based collation support into contrib/icu.
|
into core, and moved the ICU-based collation support into contrib/icu.
|
||||||
(Robert Muir)
|
(Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2326: Removed SVN checkouts for backwards tests. The backwards
|
* LUCENE-2326: Removed SVN checkouts for backwards tests. The backwards
|
||||||
|
@ -1039,14 +1048,14 @@ Build
|
||||||
|
|
||||||
* LUCENE-1709: Tests are now parallelized by default (except for benchmark). You
|
* LUCENE-1709: Tests are now parallelized by default (except for benchmark). You
|
||||||
can force them to run sequentially by passing -Drunsequential=1 on the command
|
can force them to run sequentially by passing -Drunsequential=1 on the command
|
||||||
line. The number of threads that are spawned per CPU defaults to '1'. If you
|
line. The number of threads that are spawned per CPU defaults to '1'. If you
|
||||||
wish to change that, you can run the tests with -DthreadsPerProcessor=[num].
|
wish to change that, you can run the tests with -DthreadsPerProcessor=[num].
|
||||||
(Robert Muir, Shai Erera, Peter Kofler)
|
(Robert Muir, Shai Erera, Peter Kofler)
|
||||||
|
|
||||||
* LUCENE-2516: Backwards tests are now compiled against released lucene-core.jar
|
* LUCENE-2516: Backwards tests are now compiled against released lucene-core.jar
|
||||||
from tarball of previous version. Backwards tests are now packaged together
|
from tarball of previous version. Backwards tests are now packaged together
|
||||||
with src distribution. (Uwe Schindler)
|
with src distribution. (Uwe Schindler)
|
||||||
|
|
||||||
* LUCENE-2611: Added Ant target to install IntelliJ IDEA configuration:
|
* LUCENE-2611: Added Ant target to install IntelliJ IDEA configuration:
|
||||||
"ant idea". See http://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ
|
"ant idea". See http://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ
|
||||||
(Steven Rowe)
|
(Steven Rowe)
|
||||||
|
@ -1055,8 +1064,8 @@ Build
|
||||||
generating Maven artifacts (Steven Rowe)
|
generating Maven artifacts (Steven Rowe)
|
||||||
|
|
||||||
* LUCENE-2609: Added jar-test-framework Ant target which packages Lucene's
|
* LUCENE-2609: Added jar-test-framework Ant target which packages Lucene's
|
||||||
tests' framework classes. (Drew Farris, Grant Ingersoll, Shai Erera, Steven
|
tests' framework classes. (Drew Farris, Grant Ingersoll, Shai Erera,
|
||||||
Rowe)
|
Steven Rowe)
|
||||||
|
|
||||||
Test Cases
|
Test Cases
|
||||||
|
|
||||||
|
@ -1092,18 +1101,18 @@ Test Cases
|
||||||
access to "real" files from the test folder itself, can use
|
access to "real" files from the test folder itself, can use
|
||||||
LuceneTestCase(J4).getDataFile(). (Uwe Schindler)
|
LuceneTestCase(J4).getDataFile(). (Uwe Schindler)
|
||||||
|
|
||||||
* LUCENE-2398, LUCENE-2611: Improve tests to work better from IDEs such
|
* LUCENE-2398, LUCENE-2611: Improve tests to work better from IDEs such
|
||||||
as Eclipse and IntelliJ.
|
as Eclipse and IntelliJ.
|
||||||
(Paolo Castagna, Steven Rowe via Robert Muir)
|
(Paolo Castagna, Steven Rowe via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2804: add newFSDirectory to LuceneTestCase to create a FSDirectory at
|
* LUCENE-2804: add newFSDirectory to LuceneTestCase to create a FSDirectory at
|
||||||
random. (Shai Erera, Robert Muir)
|
random. (Shai Erera, Robert Muir)
|
||||||
|
|
||||||
Documentation
|
Documentation
|
||||||
|
|
||||||
* LUCENE-2579: Fix oal.search's package.html description of abstract
|
* LUCENE-2579: Fix oal.search's package.html description of abstract
|
||||||
methods. (Santiago M. Mola via Mike McCandless)
|
methods. (Santiago M. Mola via Mike McCandless)
|
||||||
|
|
||||||
* LUCENE-2625: Add a note to IndexReader.termDocs() with additional verbiage
|
* LUCENE-2625: Add a note to IndexReader.termDocs() with additional verbiage
|
||||||
that the TermEnum must be seeked since it is unpositioned.
|
that the TermEnum must be seeked since it is unpositioned.
|
||||||
(Adriano Crestani via Robert Muir)
|
(Adriano Crestani via Robert Muir)
|
||||||
|
|
|
@ -47,26 +47,26 @@ API Changes
|
||||||
|
|
||||||
(No changes)
|
(No changes)
|
||||||
|
|
||||||
======================= Lucene 3.1 (not yet released) =======================
|
======================= Lucene 3.1.0 =======================
|
||||||
|
|
||||||
Changes in backwards compatibility policy
|
Changes in backwards compatibility policy
|
||||||
|
|
||||||
* LUCENE-2100: All Analyzers in Lucene-contrib have been marked as final.
|
* LUCENE-2100: All Analyzers in Lucene-contrib have been marked as final.
|
||||||
Analyzers should only act as a composition of TokenStreams; users should
|
Analyzers should only act as a composition of TokenStreams; users should
|
||||||
compose their own analyzers instead of subclassing existing ones.
|
compose their own analyzers instead of subclassing existing ones.
|
||||||
(Simon Willnauer)
|
(Simon Willnauer)
|
||||||
|
|
||||||
* LUCENE-2194, LUCENE-2201: Snowball APIs were upgraded to snowball revision
|
* LUCENE-2194, LUCENE-2201: Snowball APIs were upgraded to snowball revision
|
||||||
502 (with some local modifications for improved performance).
|
502 (with some local modifications for improved performance).
|
||||||
Index backwards compatibility and binary backwards compatibility are
|
Index backwards compatibility and binary backwards compatibility are
|
||||||
preserved, but some protected/public member variables changed type. This
|
preserved, but some protected/public member variables changed type. This
|
||||||
does NOT affect java code/class files produced by the snowball compiler,
|
does NOT affect java code/class files produced by the snowball compiler,
|
||||||
but technically is a backwards compatibility break. (Robert Muir)
|
but technically is a backwards compatibility break. (Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2226: Moved contrib/snowball functionality into contrib/analyzers.
|
* LUCENE-2226: Moved contrib/snowball functionality into contrib/analyzers.
|
||||||
Be sure to remove any old obsolete lucene-snowball jar files from your
|
Be sure to remove any old obsolete lucene-snowball jar files from your
|
||||||
classpath! (Robert Muir)
|
classpath! (Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2323: Moved contrib/wikipedia functionality into contrib/analyzers.
|
* LUCENE-2323: Moved contrib/wikipedia functionality into contrib/analyzers.
|
||||||
Additionally the package was changed from org.apache.lucene.wikipedia.analysis
|
Additionally the package was changed from org.apache.lucene.wikipedia.analysis
|
||||||
to org.apache.lucene.analysis.wikipedia. (Robert Muir)
|
to org.apache.lucene.analysis.wikipedia. (Robert Muir)
|
||||||
|
@ -74,30 +74,30 @@ Changes in backwards compatibility policy
|
||||||
* LUCENE-2581: Added new methods to FragmentsBuilder interface. These methods
|
* LUCENE-2581: Added new methods to FragmentsBuilder interface. These methods
|
||||||
are used to set pre/post tags and Encoder. (Koji Sekiguchi)
|
are used to set pre/post tags and Encoder. (Koji Sekiguchi)
|
||||||
|
|
||||||
* LUCENE-2391: Improved spellchecker (re)build time/ram usage by omitting
|
* LUCENE-2391: Improved spellchecker (re)build time/ram usage by omitting
|
||||||
frequencies/positions/norms for single-valued fields, modifying the default
|
frequencies/positions/norms for single-valued fields, modifying the default
|
||||||
ramBufferMBSize to match IndexWriterConfig (16MB), making index optimization
|
ramBufferMBSize to match IndexWriterConfig (16MB), making index optimization
|
||||||
an optional boolean parameter, and modifying the incremental update logic
|
an optional boolean parameter, and modifying the incremental update logic
|
||||||
to work well with unoptimized spellcheck indexes. The indexDictionary() methods
|
to work well with unoptimized spellcheck indexes. The indexDictionary() methods
|
||||||
were made final to ensure a hard backwards break in case you were subclassing
|
were made final to ensure a hard backwards break in case you were subclassing
|
||||||
Spellchecker. In general, subclassing Spellchecker is not recommended. (Robert Muir)
|
Spellchecker. In general, subclassing Spellchecker is not recommended. (Robert Muir)
|
||||||
|
|
||||||
Changes in runtime behavior
|
Changes in runtime behavior
|
||||||
|
|
||||||
* LUCENE-2117: SnowballAnalyzer uses TurkishLowerCaseFilter instead of
|
* LUCENE-2117: SnowballAnalyzer uses TurkishLowerCaseFilter instead of
|
||||||
LowerCaseFilter to correctly handle the unique Turkish casing behavior if
|
LowerCaseFilter to correctly handle the unique Turkish casing behavior if
|
||||||
used with Version > 3.0 and the TurkishStemmer.
|
used with Version > 3.0 and the TurkishStemmer.
|
||||||
(Robert Muir via Simon Willnauer)
|
(Robert Muir via Simon Willnauer)
|
||||||
|
|
||||||
* LUCENE-2055: GermanAnalyzer now uses the Snowball German2 algorithm and
|
* LUCENE-2055: GermanAnalyzer now uses the Snowball German2 algorithm and
|
||||||
stopwords list by default for Version > 3.0.
|
stopwords list by default for Version > 3.0.
|
||||||
(Robert Muir, Uwe Schindler, Simon Willnauer)
|
(Robert Muir, Uwe Schindler, Simon Willnauer)
|
||||||
|
|
||||||
Bug fixes
|
Bug fixes
|
||||||
|
|
||||||
* LUCENE-2855: contrib queryparser was using CharSequence as key in some internal
|
* LUCENE-2855: contrib queryparser was using CharSequence as key in some internal
|
||||||
Map instances, which was leading to incorrect behaviour, since some CharSequence
|
Map instances, which was leading to incorrect behavior, since some CharSequence
|
||||||
implementors do not override hashcode and equals methods. Now the internal Maps
|
implementors do not override hashcode and equals methods. Now the internal Maps
|
||||||
are using String instead. (Adriano Crestani)
|
are using String instead. (Adriano Crestani)
|
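The failure mode described for LUCENE-2855 can be reproduced with JDK classes alone. The sketch below is not the queryparser code; the class name and the "title" key are made up. It relies only on the fact that StringBuilder, a common CharSequence implementation, inherits identity-based equals and hashCode from Object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo of why CharSequence makes an unreliable map key. */
final class CharSequenceKeyDemo {
    /** Stores and looks up a value using two equal-looking StringBuilder keys. */
    static Float lookupWithCharSequenceKeys() {
        Map<CharSequence, Float> boosts = new HashMap<>();
        boosts.put(new StringBuilder("title"), 2.0f);
        // A different StringBuilder with the same characters is a different key.
        return boosts.get(new StringBuilder("title"));
    }

    /** Same lookup after converting the keys to String, which compares by content. */
    static Float lookupWithStringKeys() {
        Map<String, Float> boosts = new HashMap<>();
        boosts.put(new StringBuilder("title").toString(), 2.0f);
        return boosts.get(new StringBuilder("title").toString());
    }
}
```

The first lookup misses and returns null, while the String-keyed lookup finds the stored value, which matches the motivation given for switching the internal Maps to String.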
||||||
|
|
||||||
* LUCENE-2068: Fixed ReverseStringFilter which was not aware of supplementary
|
* LUCENE-2068: Fixed ReverseStringFilter which was not aware of supplementary
|
||||||
|
@ -106,9 +106,9 @@ Bug fixes
|
||||||
now reverses supplementary characters correctly if used with Version > 3.0.
|
now reverses supplementary characters correctly if used with Version > 3.0.
|
||||||
(Simon Willnauer, Robert Muir)
|
(Simon Willnauer, Robert Muir)
|
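To show what being aware of supplementary characters means when reversing text, here is a small JDK-only sketch. It is not the ReverseStringFilter implementation, and the class and method names are hypothetical. A supplementary character is stored as two chars (a surrogate pair), so reversing char by char swaps the pair and corrupts it, while reversing by code point keeps it intact.

```java
/** Hypothetical demo: naive versus code-point-aware string reversal. */
final class ReverseDemo {
    /** Reverses char by char; this breaks surrogate pairs. */
    static String reverseChars(String s) {
        char[] in = s.toCharArray();
        char[] out = new char[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[in.length - 1 - i];
        }
        return new String(out);
    }

    /** Reverses code point by code point; surrogate pairs keep their order. */
    static String reverseCodePoints(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = s.length(); i > 0; ) {
            int codePoint = s.codePointBefore(i);
            out.appendCodePoint(codePoint);
            i -= Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

For the input "a", U+1D11E (encoded as \uD834\uDD1E), "b", the code-point version keeps the pair in high-then-low order, whereas the naive version emits the low surrogate first.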
||||||
|
|
||||||
* LUCENE-2035: TokenSources.getTokenStream() does not assign positionIncrement.
|
* LUCENE-2035: TokenSources.getTokenStream() does not assign positionIncrement.
|
||||||
(Christopher Morris via Mark Miller)
|
(Christopher Morris via Mark Miller)
|
||||||
|
|
||||||
* LUCENE-2055: Deprecated RussianTokenizer, RussianStemmer, RussianStemFilter,
|
* LUCENE-2055: Deprecated RussianTokenizer, RussianStemmer, RussianStemFilter,
|
||||||
FrenchStemmer, FrenchStemFilter, DutchStemmer, and DutchStemFilter. For
|
FrenchStemmer, FrenchStemFilter, DutchStemmer, and DutchStemFilter. For
|
||||||
these Analyzers, SnowballFilter is used instead (for Version > 3.0), as
|
these Analyzers, SnowballFilter is used instead (for Version > 3.0), as
|
||||||
|
@ -118,7 +118,7 @@ Bug fixes
|
||||||
|
|
||||||
* LUCENE-2184: Fixed bug with handling best fit value when the proper best fit value is
|
* LUCENE-2184: Fixed bug with handling best fit value when the proper best fit value is
|
||||||
not an indexed field. Note, this change affects the APIs. (Grant Ingersoll)
|
not an indexed field. Note, this change affects the APIs. (Grant Ingersoll)
|
||||||
|
|
||||||
* LUCENE-2359: Fix bug in CartesianPolyFilterBuilder related to handling of behavior around
|
* LUCENE-2359: Fix bug in CartesianPolyFilterBuilder related to handling of behavior around
|
||||||
the 180th meridian (Grant Ingersoll)
|
the 180th meridian (Grant Ingersoll)
|
||||||
|
|
||||||
|
@ -135,15 +135,15 @@ Bug fixes
|
||||||
and regenerating a new .nrm with 'ant gennorm2'. (David Bowen via Robert Muir)
|
and regenerating a new .nrm with 'ant gennorm2'. (David Bowen via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2653: ThaiWordFilter depends on the JRE having a Thai dictionary, which is not
|
* LUCENE-2653: ThaiWordFilter depends on the JRE having a Thai dictionary, which is not
|
||||||
always the case. If the dictionary is unavailable, the filter will now throw
|
always the case. If the dictionary is unavailable, the filter will now throw
|
||||||
UnsupportedOperationException in the constructor. (Robert Muir)
|
UnsupportedOperationException in the constructor. (Robert Muir)
|
||||||
|
|
||||||
* LUCENE-589: Fix contrib/demo for international documents.
|
* LUCENE-589: Fix contrib/demo for international documents.
|
||||||
(Curtis d'Entremont via Robert Muir)
|
(Curtis d'Entremont via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2246: Fix contrib/demo for Turkish html documents.
|
* LUCENE-2246: Fix contrib/demo for Turkish html documents.
|
||||||
(Selim Nadi via Robert Muir)
|
(Selim Nadi via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-590: Demo HTML parser gives incorrect summaries when title is repeated as a heading
|
* LUCENE-590: Demo HTML parser gives incorrect summaries when title is repeated as a heading
|
||||||
(Curtis d'Entremont via Robert Muir)
|
(Curtis d'Entremont via Robert Muir)
|
||||||
|
|
||||||
|
@ -153,9 +153,9 @@ Bug fixes
|
||||||
* LUCENE-2874: Highlighting overlapping tokens outputted doubled words.
|
* LUCENE-2874: Highlighting overlapping tokens outputted doubled words.
|
||||||
(Pierre Gossé via Robert Muir)
|
(Pierre Gossé via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2943: Fix thread-safety issues with ICUCollationKeyFilter.
|
* LUCENE-2943: Fix thread-safety issues with ICUCollationKeyFilter.
|
||||||
(Robert Muir)
|
(Robert Muir)
|
||||||
|
|
||||||
API Changes
|
API Changes
|
||||||
|
|
||||||
* LUCENE-2867: Some contrib queryparser methods that receive CharSequence as
|
* LUCENE-2867: Some contrib queryparser methods that receive CharSequence as
|
||||||
|
@ -165,7 +165,7 @@ API Changes
|
||||||
* LUCENE-2147: Spatial GeoHashUtils now always decode GeoHash strings
|
* LUCENE-2147: Spatial GeoHashUtils now always decode GeoHash strings
|
||||||
with full precision. GeoHash#decode_exactly(String) was merged into
|
with full precision. GeoHash#decode_exactly(String) was merged into
|
||||||
GeoHash#decode(String). (Chris Male, Simon Willnauer)
|
GeoHash#decode(String). (Chris Male, Simon Willnauer)
|
||||||
|
|
||||||
* LUCENE-2204: Change some package private classes/members to publicly accessible to implement
|
* LUCENE-2204: Change some package private classes/members to publicly accessible to implement
|
||||||
custom FragmentsBuilders. (Koji Sekiguchi)
|
custom FragmentsBuilders. (Koji Sekiguchi)
|
||||||
|
|
||||||
|
@ -182,14 +182,14 @@ API Changes
|
||||||
* LUCENE-2626: FastVectorHighlighter: enable FragListBuilder and FragmentsBuilder
|
* LUCENE-2626: FastVectorHighlighter: enable FragListBuilder and FragmentsBuilder
|
||||||
to be set per-field override. (Koji Sekiguchi)
|
to be set per-field override. (Koji Sekiguchi)
|
||||||
|
|
||||||
* LUCENE-2712: FieldBoostMapAttribute in contrib/queryparser was changed from
|
* LUCENE-2712: FieldBoostMapAttribute in contrib/queryparser was changed from
|
||||||
a Map<CharSequence,Float> to a Map<String,Float>. Per the CharSequence javadoc,
|
a Map<CharSequence,Float> to a Map<String,Float>. Per the CharSequence javadoc,
|
||||||
CharSequence is inappropriate as a map key. (Robert Muir)
|
CharSequence is inappropriate as a map key. (Robert Muir)
|
||||||
|
|
||||||
* LUCENE-1937: Add more methods to manipulate QueryNodeProcessorPipeline elements.
|
* LUCENE-1937: Add more methods to manipulate QueryNodeProcessorPipeline elements.
|
||||||
QueryNodeProcessorPipeline now implements the List interface, this is useful
|
QueryNodeProcessorPipeline now implements the List interface, this is useful
|
||||||
if you want to extend or modify an existing pipeline. (Adriano Crestani via Robert Muir)
|
if you want to extend or modify an existing pipeline. (Adriano Crestani via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2754, LUCENE-2757: Deprecated SpanRegexQuery. Use
|
* LUCENE-2754, LUCENE-2757: Deprecated SpanRegexQuery. Use
|
||||||
new SpanMultiTermQueryWrapper<RegexQuery>(new RegexQuery()) instead.
|
new SpanMultiTermQueryWrapper<RegexQuery>(new RegexQuery()) instead.
|
||||||
(Robert Muir, Uwe Schindler)
|
(Robert Muir, Uwe Schindler)
|
||||||
|
@ -199,10 +199,10 @@ API Changes
|
||||||
|
|
||||||
* LUCENE-2830: Use StringBuilder instead of StringBuffer across Benchmark, and
|
* LUCENE-2830: Use StringBuilder instead of StringBuffer across Benchmark, and
|
||||||
remove the StringBuffer HtmlParser.parse() variant. (Shai Erera)
|
remove the StringBuffer HtmlParser.parse() variant. (Shai Erera)
|
||||||
|
|
||||||
* LUCENE-2920: Deprecated ShingleMatrixFilter as it is unmaintained and does
|
* LUCENE-2920: Deprecated ShingleMatrixFilter as it is unmaintained and does
|
||||||
not work with custom Attributes or custom payload encoders. (Uwe Schindler)
|
not work with custom Attributes or custom payload encoders. (Uwe Schindler)
|
||||||
|
|
||||||
New features
|
New features
|
||||||
|
|
||||||
* LUCENE-2500: Added DirectIOLinuxDirectory, a Linux-specific
|
* LUCENE-2500: Added DirectIOLinuxDirectory, a Linux-specific
|
||||||
|
@ -210,14 +210,14 @@ New features
|
||||||
cache. This is useful to prevent segment merging from evicting
|
cache. This is useful to prevent segment merging from evicting
|
||||||
pages from the buffer cache, since fadvise/madvise do not seem.
|
pages from the buffer cache, since fadvise/madvise do not seem.
|
||||||
(Michael McCandless)
|
(Michael McCandless)
|
||||||
|
|
||||||
* LUCENE-2306: Add NumericRangeFilter and NumericRangeQuery support to XMLQueryParser.
|
* LUCENE-2306: Add NumericRangeFilter and NumericRangeQuery support to XMLQueryParser.
|
||||||
(Jingkei Ly, via Mark Harwood)
|
(Jingkei Ly, via Mark Harwood)
|
||||||
|
|
||||||
* LUCENE-2102: Add a Turkish LowerCase Filter. TurkishLowerCaseFilter handles
|
* LUCENE-2102: Add a Turkish LowerCase Filter. TurkishLowerCaseFilter handles
|
||||||
Turkish and Azeri unique casing behavior correctly.
|
Turkish and Azeri unique casing behavior correctly.
|
||||||
(Ahmet Arslan, Robert Muir via Simon Willnauer)
|
(Ahmet Arslan, Robert Muir via Simon Willnauer)
|
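The casing behavior that motivates a dedicated Turkish filter can be seen with the JDK's locale-sensitive lowercasing. This sketch does not use TurkishLowerCaseFilter itself; the class and method names are hypothetical, and it only demonstrates the dotted/dotless i distinction in Turkish.

```java
import java.util.Locale;

/** Hypothetical demo of Turkish casing rules using only JDK string methods. */
final class TurkishCasingDemo {
    private static final Locale TURKISH = Locale.forLanguageTag("tr");

    /** Locale-neutral lowercasing: 'I' becomes 'i'. */
    static String lowerDefault(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Turkish lowercasing: 'I' becomes dotless U+0131, and U+0130 becomes 'i'. */
    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }
}
```

A generic lowercase step therefore maps Turkish "I" to the wrong letter, which is why a language-specific filter is needed for correct matching.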
||||||
|
|
||||||
* LUCENE-2039: Add an extensible query parser to contrib/misc.
|
* LUCENE-2039: Add an extensible query parser to contrib/misc.
|
||||||
ExtendableQueryParser enables arbitrary parser extensions based on a
|
ExtendableQueryParser enables arbitrary parser extensions based on a
|
||||||
customizable field naming scheme.
|
customizable field naming scheme.
|
||||||
|
@ -225,11 +225,11 @@ New features
|
||||||
|
|
||||||
* LUCENE-2067: Add a Czech light stemmer. CzechAnalyzer will now stem words
|
* LUCENE-2067: Add a Czech light stemmer. CzechAnalyzer will now stem words
|
||||||
when Version is set to 3.1 or higher. (Robert Muir)
|
when Version is set to 3.1 or higher. (Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2062: Add a Bulgarian analyzer. (Robert Muir, Simon Willnauer)
|
* LUCENE-2062: Add a Bulgarian analyzer. (Robert Muir, Simon Willnauer)
|
||||||
|
|
||||||
* LUCENE-2206: Add Snowball's stopword lists for Danish, Dutch, English,
|
* LUCENE-2206: Add Snowball's stopword lists for Danish, Dutch, English,
|
||||||
Finnish, French, German, Hungarian, Italian, Norwegian, Russian, Spanish,
|
Finnish, French, German, Hungarian, Italian, Norwegian, Russian, Spanish,
|
||||||
and Swedish. These can be loaded with WordListLoader.getSnowballWordSet.
|
and Swedish. These can be loaded with WordListLoader.getSnowballWordSet.
|
||||||
(Robert Muir, Simon Willnauer)
|
(Robert Muir, Simon Willnauer)
|
||||||
|
|
||||||
|
@ -237,7 +237,7 @@ New features
|
||||||
(Koji Sekiguchi)
|
(Koji Sekiguchi)
|
||||||
|
|
||||||
* LUCENE-2218: ShingleFilter supports minimum shingle size, and the separator
|
* LUCENE-2218: ShingleFilter supports minimum shingle size, and the separator
|
||||||
character is now configurable. It's also up to 20% faster.
|
character is now configurable. It's also up to 20% faster.
|
||||||
(Steven Rowe via Robert Muir)
|
(Steven Rowe via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2234: Add a Hindi analyzer. (Robert Muir)
|
* LUCENE-2234: Add a Hindi analyzer. (Robert Muir)
|
||||||
|
@ -267,7 +267,7 @@ New features
|
||||||
* LUCENE-2298: Add analyzers/stempel, an algorithmic stemmer with support for
|
* LUCENE-2298: Add analyzers/stempel, an algorithmic stemmer with support for
|
||||||
the Polish language. (Andrzej Bialecki via Robert Muir)
|
the Polish language. (Andrzej Bialecki via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2400: ShingleFilter was changed to not output all-filler shingles and
|
* LUCENE-2400: ShingleFilter was changed to not output all-filler shingles and
|
||||||
unigrams, and uses a more performant algorithm to build grams using a linked list
|
unigrams, and uses a more performant algorithm to build grams using a linked list
|
||||||
of AttributeSource.cloneAttributes() instances and the new copyTo() method.
|
of AttributeSource.cloneAttributes() instances and the new copyTo() method.
|
||||||
(Steven Rowe via Uwe Schindler)
|
(Steven Rowe via Uwe Schindler)
|
||||||
|
@ -286,7 +286,7 @@ New features
|
||||||
* LUCENE-2464: FastVectorHighlighter: add SingleFragListBuilder to return
|
* LUCENE-2464: FastVectorHighlighter: add SingleFragListBuilder to return
|
||||||
entire field contents. (Koji Sekiguchi)
|
entire field contents. (Koji Sekiguchi)
|
||||||
|
|
||||||
* LUCENE-2503: Added lighter stemming alternatives for European languages.
|
* LUCENE-2503: Added lighter stemming alternatives for European languages.
|
||||||
(Robert Muir)
|
(Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2581: FastVectorHighlighter: add Encoder to FragmentsBuilder.
|
* LUCENE-2581: FastVectorHighlighter: add Encoder to FragmentsBuilder.
|
||||||
|
@ -294,20 +294,23 @@ New features
|
||||||
|
|
||||||
* LUCENE-2624: Add Analyzers for Armenian, Basque, and Catalan, from snowball.
|
* LUCENE-2624: Add Analyzers for Armenian, Basque, and Catalan, from snowball.
|
||||||
(Robert Muir)
|
(Robert Muir)
|
||||||
|
|
||||||
* LUCENE-1938: PrecedenceQueryParser is now implemented with the flexible QP framework.
|
* LUCENE-1938: PrecedenceQueryParser is now implemented with the flexible QP framework.
|
||||||
This means that you can also add this functionality to your own QP pipeline by using
|
This means that you can also add this functionality to your own QP pipeline by using
|
||||||
BooleanModifiersQueryNodeProcessor, for example instead of GroupQueryNodeProcessor.
|
BooleanModifiersQueryNodeProcessor, for example instead of GroupQueryNodeProcessor.
|
||||||
(Adriano Crestani via Robert Muir)
|
(Adriano Crestani via Robert Muir)
|
||||||
|
|
||||||
* LUCENE-2791: Added WindowsDirectory, a Windows-specific Directory impl
|
* LUCENE-2791: Added WindowsDirectory, a Windows-specific Directory impl
|
||||||
that doesn't synchronize on the file handle. This can be useful to
|
that doesn't synchronize on the file handle. This can be useful to
|
||||||
avoid the performance problems of SimpleFSDirectory and NIOFSDirectory.
|
avoid the performance problems of SimpleFSDirectory and NIOFSDirectory.
|
||||||
(Robert Muir, Simon Willnauer, Uwe Schindler, Michael McCandless)
|
(Robert Muir, Simon Willnauer, Uwe Schindler, Michael McCandless)
|
||||||
|
|
||||||
* LUCENE-2842: Add analyzer for Galician. Also adds the RSLP (Orengo) stemmer
|
* LUCENE-2842: Add analyzer for Galician. Also adds the RSLP (Orengo) stemmer
|
||||||
for Portuguese. (Robert Muir)
|
for Portuguese. (Robert Muir)
|
||||||
|
|
||||||
|
* SOLR-1057: Add PathHierarchyTokenizer that represents file path hierarchies as synonyms of
|
||||||
|
/something, /something/something, /something/something/else. (Ryan McKinley, Koji Sekiguchi)
|
||||||
|
|
||||||
Build
|
Build
|
||||||
|
|
||||||
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
||||||
|
|
|
@ -247,24 +247,26 @@ Documentation
|
||||||
----------------------
|
----------------------
|
||||||
|
|
||||||
|
|
||||||
================== 3.1.0-dev ==================
|
================== 3.1.0 ==================
|
||||||
Versions of Major Components
|
Versions of Major Components
|
||||||
---------------------
|
---------------------
|
||||||
Apache Lucene trunk
|
Apache Lucene 3.1.0
|
||||||
Apache Tika 0.8
|
Apache Tika 0.8
|
||||||
Carrot2 3.4.2
|
Carrot2 3.4.2
|
||||||
|
Velocity 1.6.1 and Velocity Tools 2.0-beta3
|
||||||
|
Apache UIMA 2.3.1-SNAPSHOT
|
||||||
|
|
||||||
|
|
||||||
Upgrading from Solr 1.4
|
Upgrading from Solr 1.4
|
||||||
----------------------
|
----------------------
|
||||||
|
|
||||||
* The Lucene index format has changed and as a result, once you upgrade,
|
* The Lucene index format has changed and as a result, once you upgrade,
|
||||||
previous versions of Solr will no longer be able to read your indices.
|
previous versions of Solr will no longer be able to read your indices.
|
||||||
In a master/slave configuration, all searchers/slaves should be upgraded
|
In a master/slave configuration, all searchers/slaves should be upgraded
|
||||||
before the master. If the master were to be updated first, the older
|
before the master. If the master were to be updated first, the older
|
||||||
searchers would not be able to read the new index format.
|
searchers would not be able to read the new index format.
|
||||||
|
|
||||||
* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
|
* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
|
||||||
JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)
|
JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)
|
||||||
|
|
||||||
* The experimental ALIAS command has been removed (SOLR-1637)
|
* The experimental ALIAS command has been removed (SOLR-1637)
|
||||||
|
@ -275,10 +277,10 @@ Upgrading from Solr 1.4
|
||||||
is deprecated (SOLR-1696)
|
is deprecated (SOLR-1696)
|
||||||
|
|
||||||
* The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
|
* The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
|
||||||
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
|
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
|
||||||
HTMLStripCharFilter should be used instead, and it works with any
|
HTMLStripCharFilter should be used instead, and it works with any
|
||||||
Tokenizer of your choice. (SOLR-1657)
|
Tokenizer of your choice. (SOLR-1657)
|
||||||
|
|
||||||
* Field compression is no longer supported. Fields that were formerly
|
* Field compression is no longer supported. Fields that were formerly
|
||||||
compressed will be uncompressed as index segments are merged. For
|
compressed will be uncompressed as index segments are merged. For
|
||||||
shorter fields, this may actually be an improvement, as the compression
|
shorter fields, this may actually be an improvement, as the compression
|
||||||
|
@ -287,24 +289,24 @@ Upgrading from Solr 1.4
|
||||||
* SOLR-1845: The TermsComponent response format was changed so that the
|
* SOLR-1845: The TermsComponent response format was changed so that the
|
||||||
"terms" container is a map instead of a named list. This affects
|
"terms" container is a map instead of a named list. This affects
|
||||||
response formats like JSON, but not XML. (yonik)
|
response formats like JSON, but not XML. (yonik)
|
||||||
|
|
||||||
* SOLR-1876: All Analyzers and TokenStreams are now final to enforce
|
* SOLR-1876: All Analyzers and TokenStreams are now final to enforce
|
||||||
the decorator pattern. (rmuir, uschindler)
|
the decorator pattern. (rmuir, uschindler)
|
||||||
|
|
||||||
* LUCENE-2608: Added the ability to specify the accuracy on a per request basis.
|
* LUCENE-2608: Added the ability to specify the accuracy on a per request basis.
|
||||||
It is recommended that implementations of SolrSpellChecker change over to the new SolrSpellChecker
|
It is recommended that implementations of SolrSpellChecker change over to the new SolrSpellChecker
|
||||||
methods using the new SpellingOptions class, but are not required to. While this change is
|
methods using the new SpellingOptions class, but are not required to. While this change is
|
||||||
backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method. (gsingers)
|
backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method. (gsingers)
|
||||||
|
|
||||||
* readercycle script was removed. (SOLR-2046)
|
* readercycle script was removed. (SOLR-2046)
|
||||||
|
|
||||||
* In previous releases, sorting or evaluating function queries on
|
* In previous releases, sorting or evaluating function queries on
|
||||||
fields that were "multiValued" (either by explicit declaration in
|
fields that were "multiValued" (either by explicit declaration in
|
||||||
schema.xml or by implicit behavior because the "version" attribute on
|
schema.xml or by implicit behavior because the "version" attribute on
|
||||||
the schema was less than 1.2) did not generally work, but it would
|
the schema was less than 1.2) did not generally work, but it would
|
||||||
sometimes silently act as if it succeeded and order the docs
|
sometimes silently act as if it succeeded and order the docs
|
||||||
arbitrarily. Solr will now fail on any attempt to sort, or apply a
|
arbitrarily. Solr will now fail on any attempt to sort, or apply a
|
||||||
function to, multi-valued fields.
|
function to, multi-valued fields.
|
||||||
|
|
||||||
* The DataImportHandler jars are no longer included in the solr
|
* The DataImportHandler jars are no longer included in the solr
|
||||||
WAR and should be added in Solr's lib directory, or referenced
|
WAR and should be added in Solr's lib directory, or referenced
|
||||||
|
@ -374,13 +376,13 @@ New Features
|
||||||
* SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage.
|
* SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage.
|
||||||
(Alex Baranov via yonik)
|
(Alex Baranov via yonik)
|
||||||
|
|
||||||
* SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
|
* SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
|
||||||
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
|
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
|
||||||
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
|
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
|
||||||
performance of SnowballPorterFilterFactory. (rmuir)
|
performance of SnowballPorterFilterFactory. (rmuir)
|
||||||
|
|
||||||
* SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
|
* SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
|
||||||
TokenFilters now support custom Attributes, and some have improved performance:
|
TokenFilters now support custom Attributes, and some have improved performance:
|
||||||
especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler)
|
especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler)
|
||||||
|
|
||||||
* SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
|
* SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
|
||||||
|
@ -389,10 +391,10 @@ New Features
|
||||||
|
|
||||||
* SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
|
* SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
|
||||||
parameter, to output unigrams if the number of input tokens is fewer than
|
parameter, to output unigrams if the number of input tokens is fewer than
|
||||||
minShingleSize, and no shingles can be generated.
|
minShingleSize, and no shingles can be generated.
|
||||||
(Chris Harris via Steven Rowe)
|
(Chris Harris via Steven Rowe)
|
||||||
|
|
||||||
* SOLR-1923: PhoneticFilterFactory now has support for the
|
* SOLR-1923: PhoneticFilterFactory now has support for the
|
||||||
Caverphone algorithm. (rmuir)
|
Caverphone algorithm. (rmuir)
|
||||||
|
|
||||||
* SOLR-1957: The VelocityResponseWriter contrib moved to core.
|
* SOLR-1957: The VelocityResponseWriter contrib moved to core.
|
||||||
|
@ -460,7 +462,7 @@ New Features
|
||||||
(Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab)
|
(Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab)
|
||||||
|
|
||||||
* SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
|
* SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
|
||||||
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
|
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
|
||||||
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers)
|
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers)
|
||||||
|
|
||||||
* SOLR-2128: Full parameter substitution for function queries.
|
* SOLR-2128: Full parameter substitution for function queries.
|
||||||
|
@ -515,7 +517,7 @@ Optimizations
|
||||||
|
|
||||||
Bug Fixes
|
Bug Fixes
|
||||||
----------------------
|
----------------------
|
||||||
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
|
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
|
||||||
|
|
||||||
* SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate
|
* SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate
|
||||||
to the original ValueSource.getValues(reader) so custom sources
|
to the original ValueSource.getValues(reader) so custom sources
|
||||||
|
@ -538,8 +540,8 @@ Bug Fixes
|
||||||
* SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added
|
* SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added
|
||||||
fl=score to the parameter list instead of appending score to the
|
fl=score to the parameter list instead of appending score to the
|
||||||
existing field list. (yonik)
|
existing field list. (yonik)
|
||||||
|
|
||||||
* SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always
|
* SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always
|
||||||
uses Lucene default. (Lance Norskog via Mark Miller)
|
uses Lucene default. (Lance Norskog via Mark Miller)
|
||||||
|
|
||||||
* SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
|
* SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
|
||||||
|
@ -556,7 +558,7 @@ Bug Fixes
|
||||||
set when streaming updates, rather than using UTF-8 as the HTTP headers
|
set when streaming updates, rather than using UTF-8 as the HTTP headers
|
||||||
indicated, leading to an encoding mismatch. (hossman, yonik)
|
indicated, leading to an encoding mismatch. (hossman, yonik)
|
||||||
|
|
||||||
* SOLR-1587: A distributed search request with fl=score, didn't match
|
* SOLR-1587: A distributed search request with fl=score, didn't match
|
||||||
the behavior of a non-distributed request since it only returned
|
the behavior of a non-distributed request since it only returned
|
||||||
the id,score fields instead of all fields in addition to score. (yonik)
|
the id,score fields instead of all fields in addition to score. (yonik)
|
||||||
|
|
||||||
|
@ -565,7 +567,7 @@ Bug Fixes
|
||||||
* SOLR-1615: Backslash escaping did not work in quoted strings
|
* SOLR-1615: Backslash escaping did not work in quoted strings
|
||||||
for local param arguments. (Wojtek Piaseczny, yonik)
|
for local param arguments. (Wojtek Piaseczny, yonik)
|
||||||
|
|
||||||
* SOLR-1628: log contains incorrect number of adds and deletes.
|
* SOLR-1628: log contains incorrect number of adds and deletes.
|
||||||
(Thijs Vonk via yonik)
|
(Thijs Vonk via yonik)
|
||||||
|
|
||||||
* SOLR-343: Date faceting now respects facet.mincount limiting
|
* SOLR-343: Date faceting now respects facet.mincount limiting
|
||||||
|
@ -593,7 +595,7 @@ Bug Fixes
|
||||||
(never officially released) introduced another hanging bug due to
|
(never officially released) introduced another hanging bug due to
|
||||||
connections not being released.
|
connections not being released.
|
||||||
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)
|
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)
|
||||||
|
|
||||||
* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
|
* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
|
||||||
retrieved from ContentStreams are not closed in various places, resulting
|
retrieved from ContentStreams are not closed in various places, resulting
|
||||||
in file descriptor leaks.
|
in file descriptor leaks.
|
||||||
|
@ -602,7 +604,7 @@ Bug Fixes
|
||||||
* SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search
|
* SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search
|
||||||
(Janne Majaranta via koji)
|
(Janne Majaranta via koji)
|
||||||
|
|
||||||
* SOLR-1736: In the slave, if moving a file does not succeed, copy the file. (noble)
|
* SOLR-1736: In the slave, if moving a file does not succeed, copy the file. (noble)
|
||||||
|
|
||||||
* SOLR-1579: Fixes to XML escaping in stats.jsp
|
* SOLR-1579: Fixes to XML escaping in stats.jsp
|
||||||
(David Bowen and hossman)
|
(David Bowen and hossman)
|
||||||
|
@ -656,7 +658,7 @@ Bug Fixes
|
||||||
|
|
||||||
* SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji)
|
* SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji)
|
||||||
|
|
||||||
* SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers)
|
* SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers)
|
||||||
|
|
||||||
* SOLR-2100: The replication handler backup command didn't save the commit
|
* SOLR-2100: The replication handler backup command didn't save the commit
|
||||||
point and hence could fail when a newer commit caused the older commit point
|
point and hence could fail when a newer commit caused the older commit point
|
||||||
|
@ -665,7 +667,7 @@ Bug Fixes
|
||||||
|
|
||||||
* SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers)
|
* SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers)
|
||||||
|
|
||||||
* SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers)
|
* SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers)
|
||||||
|
|
||||||
* SOLR-2111: Change exception handling in distributed faceting to work more
|
* SOLR-2111: Change exception handling in distributed faceting to work more
|
||||||
like non-distributed faceting, change facet_counts/exception from a String
|
like non-distributed faceting, change facet_counts/exception from a String
|
||||||
|
@ -689,9 +691,9 @@ Bug Fixes
|
||||||
* SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab)
|
* SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab)
|
||||||
|
|
||||||
* SOLR-2081: BaseResponseWriter.isStreamingDocs causes
|
* SOLR-2081: BaseResponseWriter.isStreamingDocs causes
|
||||||
SingleResponseWriter.end to be called 2x
|
SingleResponseWriter.end to be called 2x
|
||||||
(Chris A. Mattmann via hossman)
|
(Chris A. Mattmann via hossman)
|
||||||
|
|
||||||
* SOLR-2219: The init() method of every SolrRequestHandler was being
|
* SOLR-2219: The init() method of every SolrRequestHandler was being
|
||||||
called twice. (ambikeshwar singh and hossman)
|
called twice. (ambikeshwar singh and hossman)
|
||||||
|
|
||||||
|
@ -716,7 +718,7 @@ Bug Fixes
|
||||||
|
|
||||||
* SOLR-482: Provide more exception handling in CSVLoader (gsingers)
|
* SOLR-482: Provide more exception handling in CSVLoader (gsingers)
|
||||||
|
|
||||||
* SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception.
|
* SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception.
|
||||||
(Julien Coloos, hossman, yonik)
|
(Julien Coloos, hossman, yonik)
|
||||||
|
|
||||||
* SOLR-2085: Improve SolrJ behavior when FacetComponent comes before
|
* SOLR-2085: Improve SolrJ behavior when FacetComponent comes before
|
||||||
|
@ -743,21 +745,29 @@ Bug Fixes
|
||||||
|
|
||||||
* SOLR-2380: Distributed faceting could miss values when facet.sort=index
|
* SOLR-2380: Distributed faceting could miss values when facet.sort=index
|
||||||
and when facet.offset was greater than 0. (yonik)
|
and when facet.offset was greater than 0. (yonik)
|
||||||
|
|
||||||
* SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
|
* SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
|
||||||
are fixed to be resolved using the URI standard (RFC 2396). The system
|
are fixed to be resolved using the URI standard (RFC 2396). The system
|
||||||
identifier is no longer a plain filename with path; it gets initialized
|
identifier is no longer a plain filename with path; it gets initialized
|
||||||
using a custom URI scheme "solrres:". This scheme is resolved using an
|
using a custom URI scheme "solrres:". This scheme is resolved using an
|
||||||
EntityResolver that utilizes ResourceLoader
|
EntityResolver that utilizes ResourceLoader
|
||||||
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
|
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
|
||||||
paths in Solr's config files behave as expected. This change
|
paths in Solr's config files behave as expected. This change
|
||||||
introduces some backwards breaks in the API: Some config classes
|
introduces some backwards breaks in the API: Some config classes
|
||||||
(Config, SolrConfig, IndexSchema) were changed to take
|
(Config, SolrConfig, IndexSchema) were changed to take
|
||||||
org.xml.sax.InputSource instead of InputStream. There may also be some
|
org.xml.sax.InputSource instead of InputStream. There may also be some
|
||||||
backwards breaks in existing config files; it is recommended to check
|
backwards breaks in existing config files; it is recommended to check
|
||||||
your config files / XSLTs and replace all XIncludes/HREFs that were
|
your config files / XSLTs and replace all XIncludes/HREFs that were
|
||||||
hacked to use absolute paths to use relative ones. (uschindler)
|
hacked to use absolute paths to use relative ones. (uschindler)
|
||||||
|
|
||||||
|
* SOLR-309: Fix FieldType so setting an analyzer on a FieldType that
|
||||||
|
doesn't expect it will generate an error. Practically speaking this
|
||||||
|
means that Solr will now correctly generate an error on
|
||||||
|
initialization if the schema.xml contains an analyzer configuration
|
||||||
|
for a fieldType that does not use TextField. (hossman)
|
||||||
|
|
||||||
|
* SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
|
||||||
|
thread safe and could throw an exception. (yonik)
|
||||||
|
|
||||||
Other Changes
|
Other Changes
|
||||||
----------------------
|
----------------------
|
||||||
|
|