mirror of https://github.com/apache/lucene.git
sync CHANGEs for 3.1
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087056 13f79535-47bb-0310-9956-ffa450edef68
parent
a4c7a88834
commit
9fdc41f0f8
|
@ -393,7 +393,7 @@ Optimizations
|
|||
* LUCENE-2990: ArrayUtil/CollectionUtil.*Sort() methods now exit early
|
||||
on empty or one-element lists/arrays. (Uwe Schindler)
|
||||
|
||||
======================= Lucene 3.1 (not yet released) =======================
|
||||
======================= Lucene 3.1.0 =======================
|
||||
|
||||
Changes in backwards compatibility policy
|
||||
|
||||
|
@ -409,7 +409,7 @@ Changes in backwards compatibility policy
|
|||
|
||||
* LUCENE-2190: Removed deprecated customScore() and customExplain()
|
||||
methods from experimental CustomScoreQuery. (Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2286: Enabled DefaultSimilarity.setDiscountOverlaps by default.
|
||||
This means that terms with a position increment gap of zero do not
|
||||
affect the norms calculation by default. (Robert Muir)
|
||||
|
@ -447,10 +447,10 @@ Changes in backwards compatibility policy
|
|||
actual file's length if the file exists, and throws FileNotFoundException
|
||||
otherwise. Returning length=0 for a non-existent file is no longer allowed. If
|
||||
you relied on that, make sure to catch the exception. (Shai Erera)
|
||||
|
||||
|
||||
* LUCENE-2386: IndexWriter no longer performs an empty commit upon new index
|
||||
creation. Previously, if you passed an empty Directory and set OpenMode to
|
||||
CREATE*, IndexWriter would make a first empty commit. If you need that
|
||||
CREATE*, IndexWriter would make a first empty commit. If you need that
|
||||
behavior you can call writer.commit()/close() immediately after you create it.
|
||||
(Shai Erera, Mike McCandless)
|
||||
|
||||
|
@ -466,10 +466,10 @@ Changes in backwards compatibility policy
|
|||
values in multi-valued field has been changed for some cases in index.
|
||||
If you index empty fields and use positions/offsets information on those
|
||||
fields, reindex is recommended. (David Smiley, Koji Sekiguchi)
|
||||
|
||||
|
||||
* LUCENE-2804: Directory.setLockFactory now declares throwing an IOException.
|
||||
(Shai Erera, Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2837: Added deprecations noting that in 4.0, Searcher and
|
||||
Searchable are collapsed into IndexSearcher; contrib/remote and
|
||||
MultiSearcher have been removed. (Mike McCandless)
|
||||
|
@ -496,7 +496,7 @@ Changes in runtime behavior
|
|||
* LUCENE-2179: CharArraySet.clear() is now functional.
|
||||
(Robert Muir, Uwe Schindler)
|
||||
|
||||
* LUCENE-2455: IndexWriter.addIndexes no longer optimizes the target index
|
||||
* LUCENE-2455: IndexWriter.addIndexes no longer optimizes the target index
|
||||
before it adds the new ones. Also, the existing segments are not merged and so
|
||||
the index will not end up with a single segment (unless it was empty before).
|
||||
In addition, addIndexesNoOptimize was renamed to addIndexes and no longer
|
||||
|
@ -515,9 +515,9 @@ Changes in runtime behavior
|
|||
usage, allowing applications to accidentally open two writers on the
|
||||
same directory. (Mike McCandless)
|
||||
|
||||
* LUCENE-2701: maxMergeMBForOptimize and maxMergeDocs constraints set on
|
||||
LogMergePolicy now affect optimize() as well (as opposed to only regular
|
||||
merges). This means that you can run optimize() and too large segments won't
|
||||
* LUCENE-2701: maxMergeMBForOptimize and maxMergeDocs constraints set on
|
||||
LogMergePolicy now affect optimize() as well (as opposed to only regular
|
||||
merges). This means that you can run optimize() and too large segments won't
|
||||
be merged. (Shai Erera)
|
||||
|
||||
* LUCENE-2753: IndexReader and DirectoryReader .listCommits() now return a List,
|
||||
|
@ -527,9 +527,9 @@ Changes in runtime behavior
|
|||
the IndexSearcher search methods that take an int nDocs will now
|
||||
throw IllegalArgumentException if nDocs is 0. Instead, you should
|
||||
use the newly added TotalHitCountCollector. (Mike McCandless)
|
||||
|
||||
* LUCENE-2790: LogMergePolicy.useCompoundFile's logic now factors in noCFSRatio
|
||||
to determine whether the passed in segment should be compound.
|
||||
|
||||
* LUCENE-2790: LogMergePolicy.useCompoundFile's logic now factors in noCFSRatio
|
||||
to determine whether the passed in segment should be compound.
|
||||
(Shai Erera, Earwin Burrfoot)
|
||||
|
||||
* LUCENE-2805: IndexWriter now increments the index version on every change to
|
||||
|
@ -549,7 +549,7 @@ Changes in runtime behavior
|
|||
|
||||
* LUCENE-2010: Segments with 100% deleted documents are now removed on
|
||||
IndexReader or IndexWriter commit. (Uwe Schindler, Mike McCandless)
|
||||
|
||||
|
||||
* LUCENE-2960: Allow some changes to IndexWriterConfig to take effect
|
||||
"live" (after an IW is instantiated), via
|
||||
IndexWriter.getConfig().setXXX(...) (Shay Banon, Mike McCandless)
|
||||
|
@ -567,7 +567,7 @@ API Changes
|
|||
|
||||
* LUCENE-2103: NoLockFactory should have a private constructor;
|
||||
until Lucene 4.0 the default one will be deprecated.
|
||||
(Shai Erera via Uwe Schindler)
|
||||
(Shai Erera via Uwe Schindler)
|
||||
|
||||
* LUCENE-2177: Deprecate the Field ctors that take byte[] and Store.
|
||||
Since the removal of compressed fields, Store can only be YES, so
|
||||
|
@ -587,30 +587,30 @@ API Changes
|
|||
files are no longer open by IndexReaders. (luocanrao via Mike
|
||||
McCandless)
|
||||
|
||||
* LUCENE-2282: IndexFileNames is exposed as a public class allowing for easier
|
||||
use by external code. In addition it offers a matchExtension method which
|
||||
* LUCENE-2282: IndexFileNames is exposed as a public class allowing for easier
|
||||
use by external code. In addition it offers a matchExtension method which
|
||||
callers can use to query whether a certain file matches a certain extension.
|
||||
(Shai Erera via Mike McCandless)
|
||||
(Shai Erera via Mike McCandless)
|
||||
|
||||
* LUCENE-124: Add a TopTermsBoostOnlyBooleanQueryRewrite to MultiTermQuery.
|
||||
This rewrite method is similar to TopTermsScoringBooleanQueryRewrite, but
|
||||
only scores terms by their boost values. For example, this can be used
|
||||
with FuzzyQuery to ensure that exact matches are always scored higher,
|
||||
only scores terms by their boost values. For example, this can be used
|
||||
with FuzzyQuery to ensure that exact matches are always scored higher,
|
||||
because only the boost will be used in scoring. (Robert Muir)
|
||||
|
||||
* LUCENE-2015: Add a static method foldToASCII to ASCIIFoldingFilter to
|
||||
|
||||
* LUCENE-2015: Add a static method foldToASCII to ASCIIFoldingFilter to
|
||||
expose its folding logic. (Cédrik Lime via Robert Muir)
|
||||
|
||||
* LUCENE-2294: IndexWriter constructors have been deprecated in favor of a
|
||||
|
||||
* LUCENE-2294: IndexWriter constructors have been deprecated in favor of a
|
||||
single ctor which accepts IndexWriterConfig and a Directory. You can set all
|
||||
the parameters related to IndexWriter on IndexWriterConfig. The different
|
||||
setter/getter methods were deprecated as well. One should call
|
||||
the parameters related to IndexWriter on IndexWriterConfig. The different
|
||||
setter/getter methods were deprecated as well. One should call
|
||||
writer.getConfig().getXYZ() to query for a parameter XYZ.
|
||||
Additionally, the setter/getter related to MergePolicy were deprecated as
|
||||
Additionally, the setter/getter related to MergePolicy were deprecated as
|
||||
well. One should interact with the MergePolicy directly.
|
||||
(Shai Erera via Mike McCandless)
|
||||
|
||||
* LUCENE-2320: IndexWriter's MergePolicy configuration was moved to
|
||||
|
||||
* LUCENE-2320: IndexWriter's MergePolicy configuration was moved to
|
||||
IndexWriterConfig and the respective methods on IndexWriter were deprecated.
|
||||
(Shai Erera via Mike McCandless)
|
||||
|
||||
|
@ -634,14 +634,14 @@ API Changes
|
|||
* LUCENE-2402: IndexWriter.deleteUnusedFiles now deletes unreferenced commit
|
||||
points too. If you use an IndexDeletionPolicy which holds onto index commits
|
||||
(such as SnapshotDeletionPolicy), you can call this method to remove those
|
||||
commit points when they are not needed anymore (instead of waiting for the
|
||||
commit points when they are not needed anymore (instead of waiting for the
|
||||
next commit). (Shai Erera)
|
||||
|
||||
|
||||
* LUCENE-2481: SnapshotDeletionPolicy.snapshot() and release() were replaced
|
||||
with equivalent ones that take a String (id) as argument. You can pass
|
||||
whatever ID you want, as long as you use the same one when calling both.
|
||||
whatever ID you want, as long as you use the same one when calling both.
|
||||
(Shai Erera)
|
||||
|
||||
|
||||
* LUCENE-2356: Add IndexWriterConfig.set/getReaderTermIndexDivisor, to
|
||||
set what IndexWriter passes for termsIndexDivisor to the readers it
|
||||
opens internally when applying deletions or creating a near-real-time
|
||||
|
@ -651,7 +651,7 @@ API Changes
|
|||
in common/standard/ now implement the Word Break rules from the Unicode 6.0.0
|
||||
Text Segmentation algorithm (UAX#29), covering the full range of Unicode code
|
||||
points, including values from U+FFFF to U+10FFFF
|
||||
|
||||
|
||||
ClassicTokenizer/Analyzer retains the old (pre-Lucene 3.1) StandardTokenizer/
|
||||
Analyzer implementation and behavior. Only the Unicode Basic Multilingual
|
||||
Plane (code points from U+0000 to U+FFFF) is covered.
|
||||
|
@ -659,16 +659,16 @@ API Changes
|
|||
UAX29URLEmailTokenizer tokenizes URLs and E-mail addresses according to the
|
||||
relevant RFCs, in addition to implementing the UAX#29 Word Break rules.
|
||||
(Steven Rowe, Robert Muir, Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2778: RAMDirectory now exposes newRAMFile(), which subclasses can override
|
||||
and return a different RAMFile implementation. (Shai Erera)
|
||||
|
||||
|
||||
* LUCENE-2785: Added TotalHitCountCollector whose sole purpose is to
|
||||
count the number of hits matching the query. (Mike McCandless)
|
||||
|
||||
* LUCENE-2846: Deprecated IndexReader.setNorm(int, String, float). This method
|
||||
is only syntactic sugar for setNorm(int, String, byte), but using the global
|
||||
Similarity.getDefault().encodeNormValue(). Use the byte-based method instead
|
||||
* LUCENE-2846: Deprecated IndexReader.setNorm(int, String, float). This method
|
||||
is only syntactic sugar for setNorm(int, String, byte), but using the global
|
||||
Similarity.getDefault().encodeNormValue(). Use the byte-based method instead
|
||||
to ensure that the norm is encoded with your Similarity.
|
||||
(Robert Muir, Mike McCandless)
|
||||
|
||||
|
@ -689,6 +689,9 @@ API Changes
|
|||
for AttributeImpls, but can still be provided (if needed).
|
||||
(Uwe Schindler)
|
||||
|
||||
* LUCENE-2691: Deprecate IndexWriter.getReader in favor of
|
||||
IndexReader.open(IndexWriter) (Grant Ingersoll, Mike McCandless)
|
||||
|
||||
* LUCENE-2876: Deprecated Scorer.getSimilarity(). If your Scorer uses a Similarity,
|
||||
it should keep it itself. Fixed Scorers to pass their parent Weight, so that
|
||||
Scorer.visitSubScorers (LUCENE-2590) will work correctly.
|
||||
|
@ -700,7 +703,7 @@ API Changes
|
|||
expert use cases can handle seeing deleted documents returned. The
|
||||
deletes remain buffered so that the next time you open an NRT reader
|
||||
and pass true, all deletes will be applied. (Mike McCandless)
|
||||
|
||||
|
||||
* LUCENE-1253: LengthFilter (and Solr's KeepWordTokenFilter) now
|
||||
require up front specification of enablePositionIncrement. Together with
|
||||
StopFilter they have a common base class (FilteringTokenFilter) that handles
|
||||
|
@ -711,7 +714,7 @@ Bug fixes
|
|||
|
||||
* LUCENE-2249: ParallelMultiSearcher should shut down thread pool on
|
||||
close. (Martin Traverso via Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2273: FieldCacheImpl.getCacheEntries() used WeakHashMap
|
||||
incorrectly and led to ConcurrentModificationException.
|
||||
(Uwe Schindler, Robert Muir)
|
||||
|
@ -722,7 +725,7 @@ Bug fixes
|
|||
|
||||
* LUCENE-2074: Reduce buffer size of lexer back to default on reset.
|
||||
(Ruben Laguna, Shai Erera via Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2496: Don't throw NPE if IndexWriter is opened with CREATE on
|
||||
a prior (corrupt) index missing its segments_N file. (Mike
|
||||
McCandless)
|
||||
|
@ -731,10 +734,10 @@ Bug fixes
|
|||
assuming whitespace tokenization. Previously all CJK queries, for example,
|
||||
would be turned into phrase queries. The old behavior is preserved with
|
||||
the matchVersion parameter for previous versions. Additionally, you can
|
||||
explicitly enable the old behavior with setAutoGeneratePhraseQueries(true)
|
||||
explicitly enable the old behavior with setAutoGeneratePhraseQueries(true)
|
||||
(Robert Muir)
|
||||
|
||||
* LUCENE-2537: FSDirectory.copy() implementation was unsafe and could result in
|
||||
|
||||
* LUCENE-2537: FSDirectory.copy() implementation was unsafe and could result in
|
||||
OOM if a large file was copied. (Shai Erera)
|
||||
|
||||
* LUCENE-2580: MultiPhraseQuery throws AIOOBE if number of positions
|
||||
|
@ -752,14 +755,14 @@ Bug fixes
|
|||
|
||||
* LUCENE-2802: NRT DirectoryReader returned incorrect values from
|
||||
getVersion, isOptimized, getCommitUserData, getIndexCommit and isCurrent due
|
||||
to a mutable reference to the IndexWriters SegmentInfos.
|
||||
to a mutable reference to the IndexWriters SegmentInfos.
|
||||
(Simon Willnauer, Earwin Burrfoot)
|
||||
|
||||
* LUCENE-2852: Fixed corner case in RAMInputStream that would hit a
|
||||
false EOF after seeking to EOF then seeking back to same block you
|
||||
were just in and then calling readBytes (Robert Muir, Mike McCandless)
|
||||
|
||||
* LUCENE-2860: Fixed SegmentInfo.sizeInBytes to factor includeDocStores when it
|
||||
* LUCENE-2860: Fixed SegmentInfo.sizeInBytes to factor includeDocStores when it
|
||||
decides whether to return the cached computed size or not. (Shai Erera)
|
||||
|
||||
* LUCENE-2584: SegmentInfo.files() could hit ConcurrentModificationException if
|
||||
|
@ -772,7 +775,7 @@ Bug fixes
|
|||
internally, it now calls Similarity.idfExplain(Collection, IndexSearcher).
|
||||
(Robert Muir)
|
||||
|
||||
* LUCENE-2693: RAM used by IndexWriter was slightly incorrectly computed.
|
||||
* LUCENE-2693: RAM used by IndexWriter was slightly incorrectly computed.
|
||||
(Jason Rutherglen via Shai Erera)
|
||||
|
||||
* LUCENE-1846: DateTools now uses the US locale everywhere, so DateTools.round()
|
||||
|
@ -788,6 +791,9 @@ Bug fixes
|
|||
been rounded down to 0 instead of being rounded up to the smallest
|
||||
positive number. (yonik)
|
||||
|
||||
* LUCENE-2936: PhraseQuery score explanations were not correctly
|
||||
identifying matches vs non-matches. (hossman)
|
||||
|
||||
* LUCENE-2975: A hotspot bug corrupts IndexInput#readVInt()/readVLong() if
|
||||
the underlying readByte() is inlined (which happens e.g. in MMapDirectory).
|
||||
The loop was unrolled, which makes the hotspot bug disappear.
|
||||
|
@ -796,30 +802,30 @@ Bug fixes
|
|||
New features
|
||||
|
||||
* LUCENE-2128: Parallelized fetching document frequencies during weight
|
||||
creation. (Israel Tsadok, Simon Willnauer via Uwe Schindler)
|
||||
creation. (Israel Tsadok, Simon Willnauer via Uwe Schindler)
|
||||
|
||||
* LUCENE-2069: Added Unicode 4 support to CharArraySet. Due to the switch
|
||||
to Java 5, supplementary characters are now lowercased correctly if the
|
||||
set is created as case insensitive.
|
||||
CharArraySet now requires a Version argument to preserve
|
||||
backwards compatibility. If Version < 3.1 is passed to the constructor,
|
||||
CharArraySet now requires a Version argument to preserve
|
||||
backwards compatibility. If Version < 3.1 is passed to the constructor,
|
||||
CharArraySet yields the old behavior. (Simon Willnauer)
|
||||
|
||||
|
||||
* LUCENE-2069: Added Unicode 4 support to LowerCaseFilter. Due to the switch
|
||||
to Java 5, supplementary characters are now lowercased correctly.
|
||||
LowerCaseFilter now requires a Version argument to preserve
|
||||
backwards compatibility. If Version < 3.1 is passed to the constructor,
|
||||
LowerCaseFilter yields the old behavior. (Simon Willnauer, Robert Muir)
|
||||
LowerCaseFilter now requires a Version argument to preserve
|
||||
backwards compatibility. If Version < 3.1 is passed to the constructor,
|
||||
LowerCaseFilter yields the old behavior. (Simon Willnauer, Robert Muir)
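
The difference between char-based and code-point-based lowercasing described in this entry can be illustrated with plain JDK calls. The following is a minimal sketch, not Lucene's LowerCaseFilter code; the class and method names (CodePointLowerCase, lowerByChar, lowerByCodePoint) are hypothetical and exist only for illustration.

```java
// Hypothetical illustration class, not part of Lucene.
final class CodePointLowerCase {

  // Lowercases one UTF-16 code unit at a time. Surrogate halves are left
  // unchanged, so supplementary characters are never lowercased.
  static String lowerByChar(String s) {
    char[] buf = s.toCharArray();
    for (int i = 0; i < buf.length; i++) {
      buf[i] = Character.toLowerCase(buf[i]);
    }
    return new String(buf);
  }

  // Lowercases one Unicode code point at a time, using the int-based
  // Character API that has been available since Java 5.
  static String lowerByCodePoint(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      sb.appendCodePoint(Character.toLowerCase(cp));
      i += Character.charCount(cp);
    }
    return sb.toString();
  }
}
```

For a supplementary character such as U+10400 (DESERET CAPITAL LETTER LONG I), lowerByChar returns the input unchanged, while lowerByCodePoint yields U+10428.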
|
||||
|
||||
* LUCENE-2034: Added ReusableAnalyzerBase, an abstract subclass of Analyzer
|
||||
that makes it easier to reuse TokenStreams correctly. This issue also added
|
||||
StopwordAnalyzerBase, which improves consistency of all Analyzers that use
|
||||
stopwords, and implement many analyzers in contrib with it.
|
||||
stopwords, and implement many analyzers in contrib with it.
|
||||
(Simon Willnauer via Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2198, LUCENE-2901: Support protected words in stemming TokenFilters using a
|
||||
new KeywordAttribute. (Simon Willnauer, Drew Farris via Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2183, LUCENE-2240, LUCENE-2241: Added Unicode 4 support
|
||||
to CharTokenizer and its subclasses. CharTokenizer now has new
|
||||
int-API which is conditionally preferred to the old char-API depending
|
||||
|
@ -828,8 +834,8 @@ New features
|
|||
|
||||
* LUCENE-2247: Added a CharArrayMap<V> for performance improvements
|
||||
in some stemmers and synonym filters. (Uwe Schindler)
|
||||
|
||||
* LUCENE-2320: Added SetOnce which wraps an object and allows it to be set
|
||||
|
||||
* LUCENE-2320: Added SetOnce which wraps an object and allows it to be set
|
||||
exactly once. (Shai Erera via Mike McCandless)
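
The "set exactly once" idea can be sketched with JDK classes alone. This is an assumption-laden illustration, not the SetOnce class added by this issue: the name SetOnceSketch and the choice of IllegalStateException for a second set() are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of a set-once holder, not Lucene's SetOnce.
final class SetOnceSketch<T> {
  private final AtomicBoolean alreadySet = new AtomicBoolean(false);
  private volatile T value;

  // Stores the value on the first call; any later call fails.
  // The exception type is an assumption made for this sketch.
  void set(T newValue) {
    if (!alreadySet.compareAndSet(false, true)) {
      throw new IllegalStateException("value was already set");
    }
    value = newValue;
  }

  // Returns the stored value, or null if set() has not been called yet.
  T get() {
    return value;
  }
}
```

The compareAndSet call guarantees that only one of several racing callers wins, which is the property a configuration holder of this kind needs.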
|
||||
|
||||
* LUCENE-2314: Added AttributeSource.copyTo(AttributeSource) that
|
||||
|
@ -856,19 +862,19 @@ New features
|
|||
Directory.copyTo, and use nio's FileChannel.transferTo when copying
|
||||
files between FSDirectory instances. (Earwin Burrfoot via Mike
|
||||
McCandless).
|
||||
|
||||
|
||||
* LUCENE-2074: Make StandardTokenizer fit for Unicode 4.0, if the
|
||||
matchVersion parameter is Version.LUCENE_31. (Uwe Schindler)
|
||||
|
||||
* LUCENE-2385: Moved NoDeletionPolicy from benchmark to core. NoDeletionPolicy
|
||||
can be used to prevent commits from ever getting deleted from the index.
|
||||
(Shai Erera)
|
||||
|
||||
* LUCENE-1585: IndexWriter now accepts a PayloadProcessorProvider which can
|
||||
return a DirPayloadProcessor for a given Directory, which returns a
|
||||
PayloadProcessor for a given Term. The PayloadProcessor will be used to
|
||||
|
||||
* LUCENE-1585: IndexWriter now accepts a PayloadProcessorProvider which can
|
||||
return a DirPayloadProcessor for a given Directory, which returns a
|
||||
PayloadProcessor for a given Term. The PayloadProcessor will be used to
|
||||
process the payloads of the segments as they are merged (e.g. if one wants to
|
||||
rewrite payloads of external indexes as they are added, or of local ones).
|
||||
rewrite payloads of external indexes as they are added, or of local ones).
|
||||
(Shai Erera, Michael Busch, Mike McCandless)
|
||||
|
||||
* LUCENE-2440: Add support for custom ExecutorService in
|
||||
|
@ -881,7 +887,7 @@ New features
|
|||
|
||||
* LUCENE-2526: Don't throw NPE from MultiPhraseQuery.toString when
|
||||
it's empty. (Ross Woolf via Mike McCandless)
|
||||
|
||||
|
||||
* LUCENE-2559: Added SegmentReader.reopen methods (John Wang via Mike
|
||||
McCandless)
|
||||
|
||||
|
@ -897,17 +903,20 @@ New features
|
|||
to add span support: SpanMultiTermQueryWrapper<Q extends MultiTermQuery>.
|
||||
Using this wrapper it's easy to add fuzzy/wildcard to e.g. a SpanNearQuery.
|
||||
(Robert Muir, Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2838: ConstantScoreQuery now directly supports wrapping a Query
|
||||
instance for stripping off scores. The use of a QueryWrapperFilter
|
||||
is no longer needed and discouraged for that use case. Directly wrapping
|
||||
Query improves performance, as out-of-order collection is now supported.
|
||||
(Uwe Schindler)
|
||||
|
||||
* LUCENE-2864: Add getMaxTermFrequency (maximum within-document TF) to
|
||||
* LUCENE-2864: Add getMaxTermFrequency (maximum within-document TF) to
|
||||
FieldInvertState so that it can be used in Similarity.computeNorm.
|
||||
(Robert Muir)
|
||||
|
||||
* LUCENE-2720: Segments now record the code version which created them.
|
||||
(Shai Erera, Mike McCandless, Uwe Schindler)
|
||||
|
||||
* LUCENE-2474: Added expert ReaderFinishedListener API to
|
||||
IndexReader, to allow apps that maintain external per-segment caches
|
||||
to evict entries when a segment is finished. (Shay Banon, Yonik
|
||||
|
@ -916,8 +925,8 @@ New features
|
|||
* LUCENE-2911: The new StandardTokenizer, UAX29URLEmailTokenizer, and
|
||||
the ICUTokenizer in contrib now all tag types with a consistent set
|
||||
of token types (defined in StandardTokenizer). Tokens in the major
|
||||
CJK types are explicitly marked to allow for custom downstream handling:
|
||||
<IDEOGRAPHIC>, <HANGUL>, <KATAKANA>, and <HIRAGANA>.
|
||||
CJK types are explicitly marked to allow for custom downstream handling:
|
||||
<IDEOGRAPHIC>, <HANGUL>, <KATAKANA>, and <HIRAGANA>.
|
||||
(Robert Muir, Steven Rowe)
|
||||
|
||||
* LUCENE-2913: Add missing getters to Numeric* classes. (Uwe Schindler)
|
||||
|
@ -942,7 +951,7 @@ Optimizations
|
|||
* LUCENE-2137: Switch to AtomicInteger for some ref counting (Earwin
|
||||
Burrfoot via Mike McCandless)
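
As a rough picture of what AtomicInteger-based reference counting looks like, here is a minimal sketch. It is not the code changed by this issue; RefCountSketch and its methods are hypothetical names, and the sketch does not guard against incRef() racing with the final decRef().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of lock-free reference counting.
final class RefCountSketch {
  // The creator holds the first reference.
  private final AtomicInteger refCount = new AtomicInteger(1);
  private volatile boolean released = false;

  // Takes an additional reference without any synchronized block.
  void incRef() {
    refCount.incrementAndGet();
  }

  // Drops one reference; the caller that reaches zero releases the resource.
  void decRef() {
    int remaining = refCount.decrementAndGet();
    if (remaining == 0) {
      released = true; // a real implementation would close resources here
    } else if (remaining < 0) {
      throw new IllegalStateException("too many decRef calls");
    }
  }

  int getRefCount() {
    return refCount.get();
  }

  boolean isReleased() {
    return released;
  }
}
```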
|
||||
|
||||
* LUCENE-2123, LUCENE-2261: Move FuzzyQuery rewrite to separate RewriteMode
|
||||
* LUCENE-2123, LUCENE-2261: Move FuzzyQuery rewrite to separate RewriteMode
|
||||
into MultiTermQuery. The number of fuzzy expansions can be specified with
|
||||
the maxExpansions parameter to FuzzyQuery.
|
||||
(Uwe Schindler, Robert Muir, Mike McCandless)
|
||||
|
@ -976,12 +985,12 @@ Optimizations
|
|||
TermAttributeImpl, move DEFAULT_TYPE constant to TypeInterface, improve
|
||||
null-handling for TypeAttribute. (Uwe Schindler)
|
||||
|
||||
* LUCENE-2329: Switch TermsHash* from using a PostingList object per unique
|
||||
* LUCENE-2329: Switch TermsHash* from using a PostingList object per unique
|
||||
term to parallel arrays, indexed by termID. This reduces garbage collection
|
||||
overhead significantly, which results in great indexing performance wins
|
||||
when the available JVM heap space is low. This will become even more
|
||||
important when the DocumentsWriter RAM buffer is searchable in the future,
|
||||
because then it will make sense to make the RAM buffers as large as
|
||||
because then it will make sense to make the RAM buffers as large as
|
||||
possible. (Mike McCandless, Michael Busch)
|
||||
|
||||
* LUCENE-2380: The terms field cache methods (getTerms,
|
||||
|
@ -996,7 +1005,7 @@ Optimizations
|
|||
causing too many fallbacks to compare-by-value (instead of by-ord).
|
||||
(Mike McCandless)
|
||||
|
||||
* LUCENE-2574: IndexInput exposes copyBytes(IndexOutput, long) to allow for
|
||||
* LUCENE-2574: IndexInput exposes copyBytes(IndexOutput, long) to allow for
|
||||
efficient copying by sub-classes. Optimized copy is implemented for RAM and FS
|
||||
streams. (Shai Erera)
|
||||
|
||||
|
@ -1019,15 +1028,15 @@ Optimizations
|
|||
|
||||
* LUCENE-2010: Segments with 100% deleted documents are now removed on
|
||||
IndexReader or IndexWriter commit. (Uwe Schindler, Mike McCandless)
|
||||
|
||||
|
||||
* LUCENE-1472: Removed synchronization from static DateTools methods
|
||||
by using a ThreadLocal. Also converted DateTools.Resolution to a
|
||||
Java 5 enum (this should not break backwards). (Uwe Schindler)
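
The general technique of replacing a synchronized shared formatter with a per-thread one can be sketched as follows. This is an illustration under assumptions, not the DateTools source: the class name ThreadLocalDateFormat, the method formatDay, and the "yyyyMMdd" pattern with a GMT time zone are choices made for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration, not Lucene's DateTools.
final class ThreadLocalDateFormat {

  // SimpleDateFormat is not thread-safe, so each thread gets its own
  // instance instead of all threads synchronizing on a shared one.
  private static final ThreadLocal<SimpleDateFormat> DAY_FORMAT =
      new ThreadLocal<SimpleDateFormat>() {
        @Override
        protected SimpleDateFormat initialValue() {
          SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.US);
          format.setTimeZone(TimeZone.getTimeZone("GMT"));
          return format;
        }
      };

  // Safe to call from any thread without locking.
  static String formatDay(Date date) {
    return DAY_FORMAT.get().format(date);
  }
}
```

The trade-off is one formatter instance per thread in exchange for removing lock contention on a hot static method.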
|
||||
|
||||
Build
|
||||
|
||||
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
||||
into core, and moved the ICU-based collation support into contrib/icu.
|
||||
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
||||
into core, and moved the ICU-based collation support into contrib/icu.
|
||||
(Robert Muir)
|
||||
|
||||
* LUCENE-2326: Removed SVN checkouts for backwards tests. The backwards
|
||||
|
@ -1039,14 +1048,14 @@ Build
|
|||
|
||||
* LUCENE-1709: Tests are now parallelized by default (except for benchmark). You
|
||||
can force them to run sequentially by passing -Drunsequential=1 on the command
|
||||
line. The number of threads that are spawned per CPU defaults to '1'. If you
|
||||
line. The number of threads that are spawned per CPU defaults to '1'. If you
|
||||
wish to change that, you can run the tests with -DthreadsPerProcessor=[num].
|
||||
(Robert Muir, Shai Erera, Peter Kofler)
|
||||
|
||||
* LUCENE-2516: Backwards tests are now compiled against released lucene-core.jar
|
||||
from tarball of previous version. Backwards tests are now packaged together
|
||||
with src distribution. (Uwe Schindler)
|
||||
|
||||
|
||||
* LUCENE-2611: Added Ant target to install IntelliJ IDEA configuration:
|
||||
"ant idea". See http://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ
|
||||
(Steven Rowe)
|
||||
|
@ -1055,8 +1064,8 @@ Build
|
|||
generating Maven artifacts (Steven Rowe)
|
||||
|
||||
* LUCENE-2609: Added jar-test-framework Ant target which packages Lucene's
|
||||
tests' framework classes. (Drew Farris, Grant Ingersoll, Shai Erera, Steven
|
||||
Rowe)
|
||||
tests' framework classes. (Drew Farris, Grant Ingersoll, Shai Erera,
|
||||
Steven Rowe)
|
||||
|
||||
Test Cases
|
||||
|
||||
|
@ -1092,18 +1101,18 @@ Test Cases
|
|||
access to "real" files from the test folder itself, can use
|
||||
LuceneTestCase(J4).getDataFile(). (Uwe Schindler)
|
||||
|
||||
* LUCENE-2398, LUCENE-2611: Improve tests to work better from IDEs such
|
||||
* LUCENE-2398, LUCENE-2611: Improve tests to work better from IDEs such
|
||||
as Eclipse and IntelliJ.
|
||||
(Paolo Castagna, Steven Rowe via Robert Muir)
|
||||
|
||||
* LUCENE-2804: add newFSDirectory to LuceneTestCase to create a FSDirectory at
|
||||
random. (Shai Erera, Robert Muir)
|
||||
|
||||
|
||||
Documentation
|
||||
|
||||
* LUCENE-2579: Fix oal.search's package.html description of abstract
|
||||
methods. (Santiago M. Mola via Mike McCandless)
|
||||
|
||||
|
||||
* LUCENE-2625: Add a note to IndexReader.termDocs() with additional verbiage
|
||||
that the TermEnum must be seeked since it is unpositioned.
|
||||
(Adriano Crestani via Robert Muir)
|
||||
|
|
|
@ -47,26 +47,26 @@ API Changes
|
|||
|
||||
(No changes)
|
||||
|
||||
======================= Lucene 3.1 (not yet released) =======================
|
||||
======================= Lucene 3.1.0 =======================
|
||||
|
||||
Changes in backwards compatibility policy
|
||||
|
||||
* LUCENE-2100: All Analyzers in Lucene-contrib have been marked as final.
|
||||
Analyzers should only act as a composition of TokenStreams; users should
|
||||
compose their own analyzers instead of subclassing existing ones.
|
||||
(Simon Willnauer)
|
||||
(Simon Willnauer)
|
||||
|
||||
* LUCENE-2194, LUCENE-2201: Snowball APIs were upgraded to snowball revision
|
||||
502 (with some local modifications for improved performance).
|
||||
Index backwards compatibility and binary backwards compatibility is
|
||||
preserved, but some protected/public member variables changed type. This
|
||||
does NOT affect java code/class files produced by the snowball compiler,
|
||||
502 (with some local modifications for improved performance).
|
||||
Index backwards compatibility and binary backwards compatibility is
|
||||
preserved, but some protected/public member variables changed type. This
|
||||
does NOT affect java code/class files produced by the snowball compiler,
|
||||
but technically is a backwards compatibility break. (Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2226: Moved contrib/snowball functionality into contrib/analyzers.
|
||||
Be sure to remove any old obsolete lucene-snowball jar files from your
|
||||
classpath! (Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2323: Moved contrib/wikipedia functionality into contrib/analyzers.
|
||||
Additionally the package was changed from org.apache.lucene.wikipedia.analysis
|
||||
to org.apache.lucene.analysis.wikipedia. (Robert Muir)
|
||||
|
@ -74,30 +74,30 @@ Changes in backwards compatibility policy
|
|||
* LUCENE-2581: Added new methods to FragmentsBuilder interface. These methods
|
||||
are used to set pre/post tags and Encoder. (Koji Sekiguchi)
|
||||
|
||||
* LUCENE-2391: Improved spellchecker (re)build time/ram usage by omitting
|
||||
* LUCENE-2391: Improved spellchecker (re)build time/ram usage by omitting
|
||||
frequencies/positions/norms for single-valued fields, modifying the default
|
||||
ramBufferMBSize to match IndexWriterConfig (16MB), making index optimization
|
||||
an optional boolean parameter, and modifying the incremental update logic
|
||||
to work well with unoptimized spellcheck indexes. The indexDictionary() methods
|
||||
were made final to ensure a hard backwards break in case you were subclassing
|
||||
to work well with unoptimized spellcheck indexes. The indexDictionary() methods
|
||||
were made final to ensure a hard backwards break in case you were subclassing
|
||||
Spellchecker. In general, subclassing Spellchecker is not recommended. (Robert Muir)
|
||||
|
||||
|
||||
Changes in runtime behavior
|
||||
|
||||
* LUCENE-2117: SnowballAnalyzer uses TurkishLowerCaseFilter instead of
|
||||
LowercaseFilter to correctly handle the unique Turkish casing behavior if
|
||||
used with Version > 3.0 and the TurkishStemmer.
|
||||
(Robert Muir via Simon Willnauer)
|
||||
(Robert Muir via Simon Willnauer)
|
||||
|
||||
* LUCENE-2055: GermanAnalyzer now uses the Snowball German2 algorithm and
|
||||
* LUCENE-2055: GermanAnalyzer now uses the Snowball German2 algorithm and
|
||||
stopwords list by default for Version > 3.0.
|
||||
(Robert Muir, Uwe Schindler, Simon Willnauer)
|
||||
|
||||
Bug fixes
|
||||
|
||||
* LUCENE-2855: contrib queryparser was using CharSequence as key in some internal
|
||||
Map instances, which was leading to incorrect behaviour, since some CharSequence
|
||||
implementors do not override hashcode and equals methods. Now the internal Maps
|
||||
* LUCENE-2855: contrib queryparser was using CharSequence as key in some internal
|
||||
Map instances, which was leading to incorrect behavior, since some CharSequence
|
||||
implementors do not override hashcode and equals methods. Now the internal Maps
|
||||
are using String instead. (Adriano Crestani)
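
The pitfall behind this fix can be reproduced with the JDK alone. In the sketch below, which is not the queryparser code, StringBuilder stands in for a CharSequence implementation that does not override hashCode and equals; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of CharSequence versus String map keys.
final class CharSequenceKeyDemo {

  // StringBuilder inherits identity-based equals/hashCode from Object,
  // so a second builder with the same characters is a different key.
  static boolean lookupWithCharSequenceKey() {
    Map<CharSequence, Integer> boosts = new HashMap<CharSequence, Integer>();
    boosts.put(new StringBuilder("title"), 1);
    return boosts.containsKey(new StringBuilder("title"));
  }

  // Converting to String first gives content-based equality.
  static boolean lookupWithStringKey() {
    Map<String, Integer> boosts = new HashMap<String, Integer>();
    boosts.put(new StringBuilder("title").toString(), 1);
    return boosts.containsKey(new StringBuilder("title").toString());
  }
}
```

The first lookup misses and the second hits, which is why keying the internal maps by String avoids the incorrect behavior.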
|
||||
|
||||
* LUCENE-2068: Fixed ReverseStringFilter which was not aware of supplementary
|
||||
|
@ -106,9 +106,9 @@ Bug fixes
|
|||
now reverses supplementary characters correctly if used with Version > 3.0.
|
||||
(Simon Willnauer, Robert Muir)
|
||||
|
||||
* LUCENE-2035: TokenSources.getTokenStream() does not assign positionIncrement.
|
||||
* LUCENE-2035: TokenSources.getTokenStream() does not assign positionIncrement.
|
||||
(Christopher Morris via Mark Miller)
|
||||
|
||||
|
||||
* LUCENE-2055: Deprecated RussianTokenizer, RussianStemmer, RussianStemFilter,
|
||||
FrenchStemmer, FrenchStemFilter, DutchStemmer, and DutchStemFilter. For
|
||||
these Analyzers, SnowballFilter is used instead (for Version > 3.0), as
|
||||
|
@ -118,7 +118,7 @@ Bug fixes
|
|||
|
||||
* LUCENE-2184: Fixed bug with handling best fit value when the proper best fit value is
|
||||
not an indexed field. Note, this change affects the APIs. (Grant Ingersoll)
|
||||
|
||||
|
||||
* LUCENE-2359: Fix bug in CartesianPolyFilterBuilder related to handling of behavior around
|
||||
the 180th meridian (Grant Ingersoll)
|
||||
|
||||
|
@ -135,15 +135,15 @@ Bug fixes
|
|||
and regenerating a new .nrm with 'ant gennorm2'. (David Bowen via Robert Muir)
|
||||
|
||||
* LUCENE-2653: ThaiWordFilter depends on the JRE having a Thai dictionary, which is not
|
||||
always the case. If the dictionary is unavailable, the filter will now throw
|
||||
UnsupportedOperationException in the constructor. (Robert Muir)
|
||||
|
||||
* LUCENE-589: Fix contrib/demo for international documents.
|
||||
(Curtis d'Entremont via Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2246: Fix contrib/demo for Turkish html documents.
|
||||
(Selim Nadi via Robert Muir)
|
||||
|
||||
* LUCENE-590: Demo HTML parser gives incorrect summaries when title is repeated as a heading
|
||||
(Curtis d'Entremont via Robert Muir)
|
||||
|
||||
|
@@ -153,9 +153,9 @@ Bug fixes
|
|||
* LUCENE-2874: Highlighting overlapping tokens outputted doubled words.
|
||||
(Pierre Gossé via Robert Muir)
|
||||
|
||||
* LUCENE-2943: Fix thread-safety issues with ICUCollationKeyFilter.
|
||||
(Robert Muir)
|
||||
|
||||
|
||||
API Changes
|
||||
|
||||
* LUCENE-2867: Some contrib queryparser methods that receive CharSequence as
|
||||
|
@@ -165,7 +165,7 @@ API Changes
|
|||
* LUCENE-2147: Spatial GeoHashUtils now always decode GeoHash strings
|
||||
with full precision. GeoHash#decode_exactly(String) was merged into
|
||||
GeoHash#decode(String). (Chris Male, Simon Willnauer)
|
||||
|
||||
|
||||
* LUCENE-2204: Change some package private classes/members to publicly accessible to implement
|
||||
custom FragmentsBuilders. (Koji Sekiguchi)
|
||||
|
||||
|
@@ -182,14 +182,14 @@ API Changes
|
|||
* LUCENE-2626: FastVectorHighlighter: enable FragListBuilder and FragmentsBuilder
|
||||
to be set per-field override. (Koji Sekiguchi)
|
||||
|
||||
* LUCENE-2712: FieldBoostMapAttribute in contrib/queryparser was changed from
|
||||
a Map<CharSequence,Float> to a Map<String,Float>. Per the CharSequence javadoc,
|
||||
CharSequence is inappropriate as a map key. (Robert Muir)
|
||||
|
||||
* LUCENE-1937: Add more methods to manipulate QueryNodeProcessorPipeline elements.
|
||||
QueryNodeProcessorPipeline now implements the List interface, this is useful
|
||||
if you want to extend or modify an existing pipeline. (Adriano Crestani via Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2754, LUCENE-2757: Deprecated SpanRegexQuery. Use
|
||||
new SpanMultiTermQueryWrapper<RegexQuery>(new RegexQuery()) instead.
|
||||
(Robert Muir, Uwe Schindler)
|
||||
|
@@ -199,10 +199,10 @@ API Changes
|
|||
|
||||
* LUCENE-2830: Use StringBuilder instead of StringBuffer across Benchmark, and
|
||||
remove the StringBuffer HtmlParser.parse() variant. (Shai Erera)
|
||||
|
||||
|
||||
* LUCENE-2920: Deprecated ShingleMatrixFilter as it is unmaintained and does
|
||||
not work with custom Attributes or custom payload encoders. (Uwe Schindler)
|
||||
|
||||
|
||||
New features
|
||||
|
||||
* LUCENE-2500: Added DirectIOLinuxDirectory, a Linux-specific
|
||||
|
@@ -210,14 +210,14 @@ New features
|
|||
cache. This is useful to prevent segment merging from evicting
|
||||
pages from the buffer cache, since fadvise/madvise do not seem.
|
||||
(Michael McCandless)
|
||||
|
||||
|
||||
* LUCENE-2306: Add NumericRangeFilter and NumericRangeQuery support to XMLQueryParser.
|
||||
(Jingkei Ly, via Mark Harwood)
|
||||
|
||||
* LUCENE-2102: Add a Turkish LowerCase Filter. TurkishLowerCaseFilter handles
|
||||
Turkish and Azeri unique casing behavior correctly.
|
||||
(Ahmet Arslan, Robert Muir via Simon Willnauer)
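The casing rule in question can be seen with the JDK's locale-sensitive String.toLowerCase. This is a sketch of the language rule only, not of TurkishLowerCaseFilter itself; the class and method names are hypothetical.

```java
import java.util.Locale;

public class TurkishCasingDemo {
    private static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Lowercases using Turkish rules: dotless capital I maps to dotless
    // lowercase i (U+0131), and dotted capital I (U+0130) maps to plain "i".
    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }

    // Lowercases using locale-neutral rules, where capital I maps to "i".
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(lowerRoot("I").equals("i"));
        System.out.println(lowerTurkish("I").equals("\u0131"));
        System.out.println(lowerTurkish("\u0130").equals("i"));
    }
}
```

A locale-neutral lowercase step would fold the dotless capital I to "i", which is a different letter in Turkish and Azeri; that mismatch is what a language-specific filter avoids.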
|
||||
|
||||
|
||||
* LUCENE-2039: Add an extensible query parser to contrib/misc.
|
||||
ExtendableQueryParser enables arbitrary parser extensions based on a
|
||||
customizable field naming scheme.
|
||||
|
@@ -225,11 +225,11 @@ New features
|
|||
|
||||
* LUCENE-2067: Add a Czech light stemmer. CzechAnalyzer will now stem words
|
||||
when Version is set to 3.1 or higher. (Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-2062: Add a Bulgarian analyzer. (Robert Muir, Simon Willnauer)
|
||||
|
||||
* LUCENE-2206: Add Snowball's stopword lists for Danish, Dutch, English,
|
||||
Finnish, French, German, Hungarian, Italian, Norwegian, Russian, Spanish,
|
||||
and Swedish. These can be loaded with WordListLoader.getSnowballWordSet.
|
||||
(Robert Muir, Simon Willnauer)
|
||||
|
||||
|
@@ -237,7 +237,7 @@ New features
|
|||
(Koji Sekiguchi)
|
||||
|
||||
* LUCENE-2218: ShingleFilter supports minimum shingle size, and the separator
|
||||
character is now configurable. It's also up to 20% faster.
|
||||
(Steven Rowe via Robert Muir)
|
||||
|
||||
* LUCENE-2234: Add a Hindi analyzer. (Robert Muir)
|
||||
|
@@ -267,7 +267,7 @@ New features
|
|||
* LUCENE-2298: Add analyzers/stempel, an algorithmic stemmer with support for
|
||||
the Polish language. (Andrzej Bialecki via Robert Muir)
|
||||
|
||||
* LUCENE-2400: ShingleFilter was changed to not output all-filler shingles and
|
||||
unigrams, and uses a more performant algorithm to build grams using a linked list
|
||||
of AttributeSource.cloneAttributes() instances and the new copyTo() method.
|
||||
(Steven Rowe via Uwe Schindler)
|
||||
|
@@ -286,7 +286,7 @@ New features
|
|||
* LUCENE-2464: FastVectorHighlighter: add SingleFragListBuilder to return
|
||||
entire field contents. (Koji Sekiguchi)
|
||||
|
||||
* LUCENE-2503: Added lighter stemming alternatives for European languages.
|
||||
(Robert Muir)
|
||||
|
||||
* LUCENE-2581: FastVectorHighlighter: add Encoder to FragmentsBuilder.
|
||||
|
@@ -294,20 +294,23 @@ New features
|
|||
|
||||
* LUCENE-2624: Add Analyzers for Armenian, Basque, and Catalan, from snowball.
|
||||
(Robert Muir)
|
||||
|
||||
|
||||
* LUCENE-1938: PrecedenceQueryParser is now implemented with the flexible QP framework.
|
||||
This means that you can also add this functionality to your own QP pipeline by using
|
||||
BooleanModifiersQueryNodeProcessor, for example instead of GroupQueryNodeProcessor.
|
||||
(Adriano Crestani via Robert Muir)
|
||||
|
||||
* LUCENE-2791: Added WindowsDirectory, a Windows-specific Directory impl
|
||||
that doesn't synchronize on the file handle. This can be useful to
|
||||
avoid the performance problems of SimpleFSDirectory and NIOFSDirectory.
|
||||
(Robert Muir, Simon Willnauer, Uwe Schindler, Michael McCandless)
|
||||
|
||||
* LUCENE-2842: Add analyzer for Galician. Also adds the RSLP (Orengo) stemmer
|
||||
for Portuguese. (Robert Muir)
|
||||
|
||||
* SOLR-1057: Add PathHierarchyTokenizer that represents file path hierarchies as synonyms of
|
||||
/something, /something/something, /something/something/else. (Ryan McKinley, Koji Sekiguchi)
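The expansion described for SOLR-1057 can be sketched in plain Java. The code below only computes the prefix strings; it does not use the Lucene TokenStream API and does not model token positions, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PathPrefixDemo {
    // Returns every ancestor prefix of the path, shortest first, followed by
    // the full path itself. A delimiter at index 0 is treated as part of the
    // first component rather than as a split point.
    static List<String> pathPrefixes(String path, char delimiter) {
        List<String> prefixes = new ArrayList<>();
        for (int i = 1; i < path.length(); i++) {
            if (path.charAt(i) == delimiter) {
                prefixes.add(path.substring(0, i));
            }
        }
        if (!path.isEmpty()) {
            prefixes.add(path);
        }
        return prefixes;
    }

    public static void main(String[] args) {
        for (String prefix : pathPrefixes("/something/something/else", '/')) {
            System.out.println(prefix);
        }
    }
}
```

Indexing each prefix as a term is what lets a query for a directory match every document stored beneath it.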
|
||||
|
||||
Build
|
||||
|
||||
* LUCENE-2124: Moved the JDK-based collation support from contrib/collation
|
||||
|
|
|
@@ -247,24 +247,26 @@ Documentation
|
|||
----------------------
|
||||
|
||||
|
||||
================== 3.1.0-dev ==================
|
||||
================== 3.1.0 ==================
|
||||
Versions of Major Components
|
||||
---------------------
|
||||
Apache Lucene trunk
|
||||
Apache Lucene 3.1.0
|
||||
Apache Tika 0.8
|
||||
Carrot2 3.4.2
|
||||
Velocity 1.6.1 and Velocity Tools 2.0-beta3
|
||||
Apache UIMA 2.3.1-SNAPSHOT
|
||||
|
||||
|
||||
Upgrading from Solr 1.4
|
||||
----------------------
|
||||
|
||||
* The Lucene index format has changed and as a result, once you upgrade,
|
||||
previous versions of Solr will no longer be able to read your indices.
|
||||
In a master/slave configuration, all searchers/slaves should be upgraded
|
||||
before the master. If the master were to be updated first, the older
|
||||
searchers would not be able to read the new index format.
|
||||
|
||||
* The Solr JavaBin format has changed as of Solr 3.1. If you are using the
|
||||
JavaBin format, you will need to upgrade your SolrJ client. (SOLR-2034)
|
||||
|
||||
* The experimental ALIAS command has been removed (SOLR-1637)
|
||||
|
@@ -275,10 +277,10 @@ Upgrading from Solr 1.4
|
|||
is deprecated (SOLR-1696)
|
||||
|
||||
* The deprecated HTMLStripReader, HTMLStripWhitespaceTokenizerFactory and
|
||||
HTMLStripStandardTokenizerFactory were removed. To strip HTML tags,
|
||||
HTMLStripCharFilter should be used instead, and it works with any
|
||||
Tokenizer of your choice. (SOLR-1657)
|
||||
|
||||
|
||||
* Field compression is no longer supported. Fields that were formerly
|
||||
compressed will be uncompressed as index segments are merged. For
|
||||
shorter fields, this may actually be an improvement, as the compression
|
||||
|
@@ -287,24 +289,24 @@ Upgrading from Solr 1.4
|
|||
* SOLR-1845: The TermsComponent response format was changed so that the
|
||||
"terms" container is a map instead of a named list. This affects
|
||||
response formats like JSON, but not XML. (yonik)
|
||||
|
||||
|
||||
* SOLR-1876: All Analyzers and TokenStreams are now final to enforce
|
||||
the decorator pattern. (rmuir, uschindler)
|
||||
|
||||
* LUCENE-2608: Added the ability to specify the accuracy on a per request basis.
|
||||
It is recommended that implementations of SolrSpellChecker should change over to the new SolrSpellChecker
|
||||
methods using the new SpellingOptions class, but are not required to. While this change is
|
||||
backward compatible, the trunk version of Solr has already dropped support for all but the SpellingOptions method. (gsingers)
|
||||
|
||||
* readercycle script was removed. (SOLR-2046)
|
||||
|
||||
* In previous releases, sorting or evaluating function queries on
|
||||
fields that were "multiValued" (either by explicit declaration in
|
||||
schema.xml or by implicit behavior because the "version" attribute on
|
||||
the schema was less than 1.2) did not generally work, but it would
|
||||
sometimes silently act as if it succeeded and order the docs
|
||||
arbitrarily. Solr will now fail on any attempt to sort, or apply a
|
||||
function to, multi-valued fields.
|
||||
|
||||
* The DataImportHandler jars are no longer included in the solr
|
||||
WAR and should be added in Solr's lib directory, or referenced
|
||||
|
@@ -374,13 +376,13 @@ New Features
|
|||
* SOLR-1379: Add RAMDirectoryFactory for non-persistent in memory index storage.
|
||||
(Alex Baranov via yonik)
|
||||
|
||||
* SOLR-1857: Synced Solr analysis with Lucene 3.1. Added KeywordMarkerFilterFactory
|
||||
and StemmerOverrideFilterFactory, which can be used to tune stemming algorithms.
|
||||
Added factories for Bulgarian, Czech, Hindi, Turkish, and Wikipedia analysis. Improved the
|
||||
performance of SnowballPorterFilterFactory. (rmuir)
|
||||
|
||||
* SOLR-1657: Converted remaining TokenStreams to the Attributes-based API. All Solr
|
||||
TokenFilters now support custom Attributes, and some have improved performance:
|
||||
especially WordDelimiterFilter and CommonGramsFilter. (rmuir, cmale, uschindler)
|
||||
|
||||
* SOLR-1740: ShingleFilterFactory supports the "minShingleSize" and "tokenSeparator"
|
||||
|
@@ -389,10 +391,10 @@ New Features
|
|||
|
||||
* SOLR-744: ShingleFilterFactory supports the "outputUnigramsIfNoShingles"
|
||||
parameter, to output unigrams if the number of input tokens is fewer than
|
||||
minShingleSize, and no shingles can be generated.
|
||||
(Chris Harris via Steven Rowe)
|
||||
|
||||
* SOLR-1923: PhoneticFilterFactory now has support for the
|
||||
Caverphone algorithm. (rmuir)
|
||||
|
||||
* SOLR-1957: The VelocityResponseWriter contrib moved to core.
|
||||
|
@@ -460,7 +462,7 @@ New Features
|
|||
(Ankul Garg, Jason Rutherglen, Shalin Shekhar Mangar, Grant Ingersoll, Robert Muir, ab)
|
||||
|
||||
* SOLR-1568: Added "native" filtering support for PointType, GeohashField. Added LatLonType with filtering support too. See
|
||||
http://wiki.apache.org/solr/SpatialSearch and the example. Refactored some items in Lucene spatial.
|
||||
Removed SpatialTileField as the underlying CartesianTier is broken beyond repair and is going to be moved. (gsingers)
|
||||
|
||||
* SOLR-2128: Full parameter substitution for function queries.
|
||||
|
@@ -515,7 +517,7 @@ Optimizations
|
|||
|
||||
Bug Fixes
|
||||
----------------------
|
||||
* SOLR-1769: Solr 1.4 Replication - Repeater throwing NullPointerException (Jörgen Rydenius via noble)
|
||||
|
||||
* SOLR-1432: Make the new ValueSource.getValues(context,reader) delegate
|
||||
to the original ValueSource.getValues(reader) so custom sources
|
||||
|
@@ -538,8 +540,8 @@ Bug Fixes
|
|||
* SOLR-1584: SolrJ - SolrQuery.setIncludeScore() incorrectly added
|
||||
fl=score to the parameter list instead of appending score to the
|
||||
existing field list. (yonik)
|
||||
|
||||
* SOLR-1580: Solr Configuration ignores 'mergeFactor' parameter, always
|
||||
uses Lucene default. (Lance Norskog via Mark Miller)
|
||||
|
||||
* SOLR-1593: ReverseWildcardFilter didn't work for surrogate pairs
|
||||
|
@@ -556,7 +558,7 @@ Bug Fixes
|
|||
set when streaming updates, rather than using UTF-8 as the HTTP headers
|
||||
indicated, leading to an encoding mismatch. (hossman, yonik)
|
||||
|
||||
* SOLR-1587: A distributed search request with fl=score, didn't match
|
||||
the behavior of a non-distributed request since it only returned
|
||||
the id,score fields instead of all fields in addition to score. (yonik)
|
||||
|
||||
|
@@ -565,7 +567,7 @@ Bug Fixes
|
|||
* SOLR-1615: Backslash escaping did not work in quoted strings
|
||||
for local param arguments. (Wojtek Piaseczny, yonik)
|
||||
|
||||
* SOLR-1628: log contains incorrect number of adds and deletes.
|
||||
(Thijs Vonk via yonik)
|
||||
|
||||
* SOLR-343: Date faceting now respects facet.mincount limiting
|
||||
|
@@ -593,7 +595,7 @@ Bug Fixes
|
|||
(never officially released) introduced another hanging bug due to
|
||||
connections not being released.
|
||||
(Attila Babo, Erik Hetzner, Johannes Tuchscherer via yonik)
|
||||
|
||||
|
||||
* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
|
||||
retrieved from ContentStreams are not closed in various places, resulting
|
||||
in file descriptor leaks.
|
||||
|
@@ -602,7 +604,7 @@ Bug Fixes
|
|||
* SOLR-1753: StatsComponent throws NPE when getting statistics for facets in distributed search
|
||||
(Janne Majaranta via koji)
|
||||
|
||||
* SOLR-1736: In the slave, if moving the file does not succeed, copy the file. (noble)
|
||||
|
||||
* SOLR-1579: Fixes to XML escaping in stats.jsp
|
||||
(David Bowen and hossman)
|
||||
|
@@ -656,7 +658,7 @@ Bug Fixes
|
|||
|
||||
* SOLR-2047: ReplicationHandler should accept bool type for enable flag. (koji)
|
||||
|
||||
* SOLR-1630: Fix spell checking collation issue related to token positions (rmuir, gsingers)
|
||||
|
||||
* SOLR-2100: The replication handler backup command didn't save the commit
|
||||
point and hence could fail when a newer commit caused the older commit point
|
||||
|
@@ -665,7 +667,7 @@ Bug Fixes
|
|||
|
||||
* SOLR-2114: Fixed parsing error in hsin function. The function signature has changed slightly. (gsingers)
|
||||
|
||||
* SOLR-2083: SpellCheckComponent misreports suggestions when distributed (James Dyer via gsingers)
|
||||
|
||||
* SOLR-2111: Change exception handling in distributed faceting to work more
|
||||
like non-distributed faceting, change facet_counts/exception from a String
|
||||
|
@@ -689,9 +691,9 @@ Bug Fixes
|
|||
* SOLR-2173: Suggester should always rebuild Lookup data if Lookup.load fails. (ab)
|
||||
|
||||
* SOLR-2081: BaseResponseWriter.isStreamingDocs causes
|
||||
SingleResponseWriter.end to be called 2x
|
||||
(Chris A. Mattmann via hossman)
|
||||
|
||||
* SOLR-2219: The init() method of every SolrRequestHandler was being
|
||||
called twice. (ambikeshwar singh and hossman)
|
||||
|
||||
|
@@ -716,7 +718,7 @@ Bug Fixes
|
|||
|
||||
* SOLR-482: Provide more exception handling in CSVLoader (gsingers)
|
||||
|
||||
* SOLR-1283: HTMLStripCharFilter sometimes threw a "Mark Invalid" exception.
|
||||
(Julien Coloos, hossman, yonik)
|
||||
|
||||
* SOLR-2085: Improve SolrJ behavior when FacetComponent comes before
|
||||
|
@@ -743,21 +745,29 @@ Bug Fixes
|
|||
|
||||
* SOLR-2380: Distributed faceting could miss values when facet.sort=index
|
||||
and when facet.offset was greater than 0. (yonik)
|
||||
|
||||
|
||||
* SOLR-1656: XIncludes and other HREFs in XML files loaded by ResourceLoader
|
||||
are fixed to be resolved using the URI standard (RFC 2396). The system
|
||||
identifier is no longer a plain filename with path, it gets initialized
|
||||
using a custom URI scheme "solrres:". This scheme is resolved using a
|
||||
EntityResolver that utilizes ResourceLoader
|
||||
(org.apache.solr.common.util.SystemIdResolver). This makes all relative
|
||||
paths in Solr's config files behave as expected. This change
|
||||
introduces some backwards breaks in the API: Some config classes
|
||||
(Config, SolrConfig, IndexSchema) were changed to take
|
||||
org.xml.sax.InputSource instead of InputStream. There may also be some
|
||||
backwards breaks in existing config files, it is recommended to check
|
||||
your config files / XSLTs and replace all XIncludes/HREFs that were
|
||||
hacked to use absolute paths to use relative ones. (uschindler)
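The effect of URI-based resolution on relative references can be sketched with java.net.URI. This is not the SystemIdResolver code: the exact shape of the "solrres:" system identifiers is an assumption made for the example (a hierarchical form such as solrres:/conf/solrconfig.xml), and the file names, class name, and method name are hypothetical.

```java
import java.net.URI;

public class SolrResResolveDemo {
    // Resolves an href found inside a config file against the system
    // identifier of that file, using the generic URI resolution rules.
    static String resolve(String baseSystemId, String href) {
        return URI.create(baseSystemId).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Assumed system identifier of the including file.
        String base = "solrres:/conf/solrconfig.xml";

        // A relative href is resolved against the directory of the base.
        System.out.println(resolve(base, "includes/handlers.xml"));

        // Parent references are normalized away during resolution.
        System.out.println(resolve(base, "../shared/synonyms.xml"));
    }
}
```

Under these assumptions a relative XInclude is interpreted relative to the file that contains it, which is why hrefs that were previously hacked into absolute paths can be replaced with relative ones.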
|
||||
|
||||
* SOLR-309: Fix FieldType so setting an analyzer on a FieldType that
|
||||
doesn't expect it will generate an error. Practically speaking this
|
||||
means that Solr will now correctly generate an error on
|
||||
initialization if the schema.xml contains an analyzer configuration
|
||||
for a fieldType that does not use TextField. (hossman)
|
||||
|
||||
* SOLR-2192: StreamingUpdateSolrServer.blockUntilFinished was not
|
||||
thread safe and could throw an exception. (yonik)
|
||||
|
||||
Other Changes
|
||||
----------------------
|
||||
|
|