mirror of https://github.com/apache/lucene.git
add comments from Doug describing how BooleanScorers work
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@727475 13f79535-47bb-0310-9956-ffa450edef68
parent d3987d9ed4
commit 4561cb369d
@@ -19,6 +19,39 @@ package org.apache.lucene.search;
 
 import java.io.IOException;
 
+/* Description from Doug Cutting (excerpted from
+ * LUCENE-1483):
+ *
+ * BooleanScorer uses a ~16k array to score windows of
+ * docs. So it scores docs 0-16k first, then docs 16-32k,
+ * etc. For each window it iterates through all query terms
+ * and accumulates a score in table[doc%16k]. It also stores
+ * in the table a bitmask representing which terms
+ * contributed to the score. Non-zero scores are chained in
+ * a linked list. At the end of scoring each window it then
+ * iterates through the linked list and, if the bitmask
+ * matches the boolean constraints, collects a hit. For
+ * boolean queries with lots of frequent terms this can be
+ * much faster, since it does not need to update a priority
+ * queue for each posting, instead performing constant-time
+ * operations per posting. The only downside is that it
+ * results in hits being delivered out-of-order within the
+ * window, which means it cannot be nested within other
+ * scorers. But it works well as a top-level scorer.
+ *
+ * The new BooleanScorer2 implementation instead works by
+ * merging priority queues of postings, albeit with some
+ * clever tricks. For example, a pure conjunction (all terms
+ * required) does not require a priority queue. Instead it
+ * sorts the posting streams at the start, then repeatedly
+ * skips the first to the last. If the first ever equals
+ * the last, then there's a hit. When some terms are
+ * required and some terms are optional, the conjunction can
+ * be evaluated first, then the optional terms can all skip
+ * to the match and be added to the score. Thus the
+ * conjunction can reduce the number of priority queue
+ * updates for the optional terms. */
+
 final class BooleanScorer extends Scorer {
   private SubScorer scorers = null;
   private BucketTable bucketTable = new BucketTable();
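The windowed scheme described in the BooleanScorer comment above can be sketched roughly as follows. This is an illustration only, not the code in BooleanScorer.java: the class and method names, the use of sorted int arrays as postings, the fixed per-term score, the int bitmask (which limits the sketch to at most 31 terms) and the windowSize parameter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of window-at-a-time scoring; not Lucene's implementation.
class WindowedScoringSketch {

    /**
     * postings[t] holds the sorted doc ids (all < maxDoc) for term t and
     * termScores[t] is an assumed constant score contribution for that term.
     * A doc is a hit if its bitmask contains every bit of requiredMask and
     * no bit of prohibitedMask. Hits are returned in collection order, and
     * scoreByDoc (length >= maxDoc) receives the accumulated score per hit.
     */
    static List<Integer> collect(int[][] postings, float[] termScores,
                                 int requiredMask, int prohibitedMask,
                                 int maxDoc, int windowSize, float[] scoreByDoc) {
        float[] score = new float[windowSize]; // table[doc % windowSize]
        int[] bits = new int[windowSize];      // which terms touched the slot
        int[] next = new int[windowSize];      // linked list of touched slots
        int[] pos = new int[postings.length];  // read position per term
        List<Integer> hits = new ArrayList<>();

        for (int base = 0; base < maxDoc; base += windowSize) {
            int end = Math.min(base + windowSize, maxDoc);
            int head = -1; // empty list

            // Pass 1: every term adds its postings that fall in this window.
            for (int t = 0; t < postings.length; t++) {
                while (pos[t] < postings[t].length && postings[t][pos[t]] < end) {
                    int slot = postings[t][pos[t]++] % windowSize;
                    if (bits[slot] == 0) { // first touch: reset and chain the slot
                        score[slot] = 0f;
                        next[slot] = head;
                        head = slot;
                    }
                    score[slot] += termScores[t]; // constant-time work per posting
                    bits[slot] |= 1 << t;
                }
            }

            // Pass 2: walk the chain and keep slots whose bitmask passes.
            for (int slot = head; slot != -1; ) {
                int following = next[slot];
                boolean hasRequired = (bits[slot] & requiredMask) == requiredMask;
                boolean hasProhibited = (bits[slot] & prohibitedMask) != 0;
                if (hasRequired && !hasProhibited) {
                    hits.add(base + slot);
                    scoreByDoc[base + slot] = score[slot];
                }
                bits[slot] = 0; // clear for the next window
                slot = following;
            }
        }
        return hits;
    }
}
```

Because touched slots are pushed onto the head of the chain, hits inside one window come back in an order unrelated to doc id, which is the out-of-order delivery the comment names as the downside.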
@@ -22,6 +22,9 @@ import java.util.ArrayList;
 import java.util.List;
 import java.util.Iterator;
 
+/* See the description in BooleanScorer.java, comparing
+ * BooleanScorer & BooleanScorer2 */
+
 /** An alternative to BooleanScorer that also allows a minimum number
  * of optional scorers that should match.
  * <br>Implements skipTo(), and has no limitations on the numbers of added scorers.
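The pure-conjunction trick mentioned in the comment (sort the posting streams once, then repeatedly skip the first to the last) can be illustrated with a small sketch. It is not the BooleanScorer2 code: the Stream helper, its skipTo method over a sorted int array, the NO_MORE sentinel and the intersect method are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of a priority-queue-free conjunction; not Lucene's code.
class ConjunctionSketch {
    static final int NO_MORE = Integer.MAX_VALUE; // assumed "exhausted" sentinel

    /** A posting stream over a sorted array of doc ids. */
    static final class Stream {
        final int[] docs;
        int idx = 0;

        Stream(int[] docs) { this.docs = docs; }

        int doc() { return idx < docs.length ? docs[idx] : NO_MORE; }

        /** Advances to the first doc >= target and returns it. */
        int skipTo(int target) {
            while (idx < docs.length && docs[idx] < target) idx++;
            return doc();
        }
    }

    /** Returns the doc ids present in every posting list, in increasing order. */
    static List<Integer> intersect(int[][] postings) {
        List<Integer> hits = new ArrayList<>();
        int n = postings.length;
        if (n == 0) return hits;

        Stream[] streams = new Stream[n];
        for (int i = 0; i < n; i++) streams[i] = new Stream(postings[i]);

        // Sort once by current doc. Afterwards the array is used as a ring:
        // 'first' is the stream furthest behind, the one before it is 'last'.
        Arrays.sort(streams, Comparator.comparingInt(Stream::doc));
        int first = 0;
        int last = streams[n - 1].doc();

        while (last != NO_MORE) {
            // Skip the first stream to the last; it then becomes the new last.
            while (streams[first].doc() < last) {
                last = streams[first].skipTo(last);
                first = (first + 1) % n;
            }
            if (last == NO_MORE) break; // some stream ran out
            hits.add(last);             // first == last, so all streams agree
            last = streams[first].skipTo(last + 1); // step past the hit
            first = (first + 1) % n;
        }
        return hits;
    }
}
```

For example, intersecting {1, 3, 5, 8, 13}, {3, 4, 5, 13, 20} and {2, 3, 13, 14} yields 3 and 13. The ring order stays sorted because each skipped stream lands at or beyond the current maximum, so no priority queue updates are needed.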