mirror of
https://github.com/apache/lucene.git
synced 2025-02-08 19:15:06 +00:00
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1701621 13f79535-47bb-0310-9956-ffa450edef68
81 lines
3.6 KiB
Plaintext
# Apache Lucene Migration Guide

## The way the number of documents is calculated has changed (LUCENE-6711)

The number of documents (numDocs) is used to calculate term specificity (idf) and average document length (avdl).
Prior to LUCENE-6711, collectionStats.maxDoc() was used for these statistics.
Now collectionStats.docCount() is used whenever possible; if it is not available, maxDoc() is used.

Assume that a collection contains 100 documents, and 50 of them have a "keywords" field.
In this example, maxDoc is 100 while docCount is 50 for the "keywords" field.
The total number of tokens for the "keywords" field is divided by docCount to obtain avdl.
Therefore docCount, which is the total number of documents that have at least one term for the field, is a more precise metric for optional fields.
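
As a rough numeric sketch of the "keywords" example, the snippet below computes avdl both ways. The total of 1000 tokens and the names AvdlExample and averageLength are assumptions made for illustration; they are not part of Lucene.

```java
/** Illustrates how the choice of document count changes the average field length (avdl). */
final class AvdlExample {
    /** Average field length: total tokens in the field divided by the number of documents counted. */
    static double averageLength(long totalTokens, long numDocs) {
        return (double) totalTokens / numDocs;
    }

    public static void main(String[] args) {
        long maxDoc = 100;       // all documents in the collection (from the text)
        long docCount = 50;      // documents that have the "keywords" field (from the text)
        long totalTokens = 1000; // assumed token total for "keywords", chosen for illustration

        System.out.println("avdl using maxDoc:   " + averageLength(totalTokens, maxDoc));   // 10.0
        System.out.println("avdl using docCount: " + averageLength(totalTokens, docCount)); // 20.0
    }
}
```

Dividing by maxDoc counts the 50 documents without the field as if they had zero-length values, which halves the average in this case.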

DefaultSimilarity does not leverage avdl, so this change should have a relatively minor effect on the result list, because the relative idf values of terms remain the same.
However, when idf is combined with other factors such as term frequency, the relative ranking of documents could change.
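
The claim about relative idf values can be checked with a small sketch. It assumes the classic formula idf = 1 + ln(numDocs / (docFreq + 1)); the class name, method names, and document frequencies below are hypothetical and chosen only for illustration.

```java
/** Sketch of the classic idf formula, showing that the gap between two terms ignores numDocs. */
final class ClassicIdfExample {
    /** Classic idf (assumed form): 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** Difference between the idf of a rarer term and that of a more common term. */
    static double idfGap(long rareDocFreq, long commonDocFreq, long numDocs) {
        return idf(rareDocFreq, numDocs) - idf(commonDocFreq, numDocs);
    }

    public static void main(String[] args) {
        long rareDocFreq = 4;    // assumed: term appears in 4 documents
        long commonDocFreq = 24; // assumed: term appears in 24 documents

        // Each idf value shrinks when numDocs drops from maxDoc (100) to docCount (50) ...
        System.out.println("idf(rare), numDocs=100: " + idf(rareDocFreq, 100));
        System.out.println("idf(rare), numDocs=50:  " + idf(rareDocFreq, 50));

        // ... but the gap between the two terms is ln(25 / 5) in both cases, up to rounding.
        System.out.println("gap, numDocs=100: " + idfGap(rareDocFreq, commonDocFreq, 100));
        System.out.println("gap, numDocs=50:  " + idfGap(rareDocFreq, commonDocFreq, 50));
    }
}
```

Every idf value moves by the same constant, so the ordering of terms by idf is preserved. That shifted value is then multiplied by term-frequency factors, which is why document ordering can still change.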

Some Similarity implementations (such as the ones instantiated with NormalizationH2, and BM25) take avdl into account and would show a notable change in the ranked list,
especially if you have a collection of documents with varying lengths,
because NormalizationH2 tends to punish documents longer than avdl.
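
To make the length penalty concrete, here is a minimal sketch that assumes the DFR "normalization 2" formula tfn = tf * log2(1 + c * avdl / docLength). The class name and all input values are hypothetical; the avdl values of 10 and 20 correspond to an assumed total of 1000 tokens divided by maxDoc = 100 and docCount = 50.

```java
/** Sketch of DFR "normalization 2" (assumed form): tfn = tf * log2(1 + c * avdl / docLength). */
final class NormalizationH2Example {
    /** Length-normalized term frequency for a document of the given length. */
    static double tfn(double tf, double docLength, double avdl, double c) {
        return tf * (Math.log(1.0 + c * avdl / docLength) / Math.log(2.0));
    }

    public static void main(String[] args) {
        double tf = 3.0;         // assumed term frequency
        double docLength = 40.0; // assumed length of a long document, in tokens
        double c = 1.0;          // assumed normalization parameter

        System.out.println("tfn with avdl=10 (from maxDoc):   " + tfn(tf, docLength, 10.0, c));
        System.out.println("tfn with avdl=20 (from docCount): " + tfn(tf, docLength, 20.0, c));
    }
}
```

Under this formula a document longer than avdl gets a tfn below its raw tf, so raising avdl from 10 to 20 softens the penalty for the 40-token document.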

## Separation of IndexDocument and StoredDocument (LUCENE-3312)

The API of oal.document was restructured to differentiate between stored
documents and indexed documents. IndexReader.document(int) now returns
StoredDocument instead of Document. In most cases a simple replacement
of the return type is enough to upgrade.

## FunctionValues.exists() Behavior Changes due to ValueSource bug fixes (LUCENE-5961)

Bugs fixed in several ValueSource functions may result in different behavior in
situations where some documents do not have values for fields wrapped in other
ValueSources. Users who want to preserve the previous behavior may need to wrap
their ValueSources in a "DefFunction" along with a ConstValueSource of "0.0".

## Removal of FilteredQuery (LUCENE-6583)

FilteredQuery has been removed. Instead, you can construct a BooleanQuery with
one MUST clause for the query, and one FILTER clause for the filter.

## PhraseQuery and BooleanQuery made immutable (LUCENE-6531 LUCENE-6570)

PhraseQuery and BooleanQuery are now immutable and have a builder API to help
construct them. For instance a BooleanQuery that used to be constructed like
this:

    BooleanQuery bq = new BooleanQuery();
    bq.add(q1, Occur.SHOULD);
    bq.add(q2, Occur.SHOULD);
    bq.add(q3, Occur.MUST);
    bq.setMinimumNumberShouldMatch(1);

can now be constructed this way using its builder:

    BooleanQuery bq = new BooleanQuery.Builder()
        .add(q1, Occur.SHOULD)
        .add(q2, Occur.SHOULD)
        .add(q3, Occur.MUST)
        .setMinimumNumberShouldMatch(1)
        .build();

## AttributeImpl now requires that reflectWith() is implemented (LUCENE-6651)

AttributeImpl removed the default, reflection-based implementation of
reflectWith(AttributeReflector). The method was made abstract. If you have
implemented your own attribute, make sure to add the required method signature.
See the Javadocs for an example.

## Query.setBoost() and Query.clone() are removed (LUCENE-6590)

Query.setBoost has been removed. In order to apply a boost to a Query, you now
need to wrap it inside a BoostQuery. For instance,

    Query q = ...;
    float boost = ...;
    q = new BoostQuery(q, boost);

would be equivalent to the following code with the old setBoost API:

    Query q = ...;
    float boost = ...;
    q.setBoost(q.getBoost() * boost);