add comments from Doug describing how BooleanScorers work

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@727475 13f79535-47bb-0310-9956-ffa450edef68
2008-12-17 19:16:10 +00:00 · 2008-12-17 19:16:10 +00:00 · 4561cb369d
parent d3987d9ed4
commit 4561cb369d
2 changed files with 36 additions and 0 deletions
--- a/src/java/org/apache/lucene/search/BooleanScorer.java
+++ b/src/java/org/apache/lucene/search/BooleanScorer.java
@ -19,6 +19,39 @@ package org.apache.lucene.search;
 import java.io.IOException;
 /* Description from Doug Cutting (excerpted from
 * LUCENE-1483):
 *
 * BooleanScorer uses a ~16k array to score windows of
 * docs. So it scores docs 0-16k first, then docs 16-32k,
 * etc. For each window it iterates through all query terms
 * and accumulates a score in table[doc%16k]. It also stores
 * in the table a bitmask representing which terms
 * contributed to the score. Non-zero scores are chained in
 * a linked list. At the end of scoring each window it then
 * iterates through the linked list and, if the bitmask
 * matches the boolean constraints, collects a hit. For
 * boolean queries with lots of frequent terms this can be
 * much faster, since it does not need to update a priority
 * queue for each posting, instead performing constant-time
 * operations per posting. The only downside is that it
 * results in hits being delivered out-of-order within the
 * window, which means it cannot be nested within other
 * scorers. But it works well as a top-level scorer.
 *
 * The new BooleanScorer2 implementation instead works by
 * merging priority queues of postings, albeit with some
 * clever tricks. For example, a pure conjunction (all terms
 * required) does not require a priority queue. Instead it
 * sorts the posting streams at the start, then repeatedly
 * skips the first to to the last. If the first ever equals
 * the last, then there's a hit. When some terms are
 * required and some terms are optional, the conjunction can
 * be evaluated first, then the optional terms can all skip
 * to the match and be added to the score. Thus the
 * conjunction can reduce the number of priority queue
 * updates for the optional terms. */
 final class BooleanScorer extends Scorer {
  private SubScorer scorers = null;
  private BucketTable bucketTable = new BucketTable();
--- a/src/java/org/apache/lucene/search/BooleanScorer2.java
+++ b/src/java/org/apache/lucene/search/BooleanScorer2.java
@ -22,6 +22,9 @@ import java.util.ArrayList;
 import java.util.List;
 import java.util.Iterator;
 /* See the description in BooleanScorer.java, comparing
 * BooleanScorer & BooleanScorer2 */
 /** An alternative to BooleanScorer that also allows a minimum number
 * of optional scorers that should match.
 * <br>Implements skipTo(), and has no limitations on the numbers of added scorers.