Fix MaxScoreBulkScorer#score's return value. (#12400)

`AssertingBulkScorer` asserts that the return value of `BulkScorer#score` may
not be in `[maxDoc, NO_MORE_DOCS)`. While this is not part of the contract of
`BulkScorer#score`, a reasonable implementation should never have return values
in this range, as it would suggest that more matches need collecting when we're
already out of the range of valid doc IDs. So this generally indicates a bug.

`MaxScoreBulkScorer` failed this assertion, because it can sometimes skip the
requested window of doc IDs, when the sum of maximum scores would be less than
the minimum competitive score. In that case, the best information it has is
that there are no matches in the window, but it cannot give a good estimate of
the next potential match.

This assertion in `AssertingBulkScorer` looks sane to me, so I made a small
change to `MaxScoreBulkScorer` to make sure it meets `AssertingScorer`'s
expectations. This is done in a place that is only called once per scored
window, so it should not have a noticeable performance impact.
This commit is contained in:
Adrien Grand 2023-06-28 09:17:41 +02:00 committed by GitHub
parent 6bb8cc0235
commit 4029cc37a7
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 8 additions and 2 deletions

View File

@ -218,7 +218,7 @@ final class BooleanWeight extends Weight {
optionalScorers.add(ss.get(Long.MAX_VALUE)); optionalScorers.add(ss.get(Long.MAX_VALUE));
} }
return new MaxScoreBulkScorer(optionalScorers); return new MaxScoreBulkScorer(context.reader().maxDoc(), optionalScorers);
} }
List<BulkScorer> optional = new ArrayList<BulkScorer>(); List<BulkScorer> optional = new ArrayList<BulkScorer>();

View File

@ -24,6 +24,7 @@ import org.apache.lucene.util.Bits;
final class MaxScoreBulkScorer extends BulkScorer { final class MaxScoreBulkScorer extends BulkScorer {
private final int maxDoc;
// All scorers, sorted by increasing max score. // All scorers, sorted by increasing max score.
private final DisiWrapper[] allScorers; private final DisiWrapper[] allScorers;
// These are the last scorers from `allScorers` that are "essential", ie. required for a match to // These are the last scorers from `allScorers` that are "essential", ie. required for a match to
@ -39,7 +40,8 @@ final class MaxScoreBulkScorer extends BulkScorer {
private ScoreAndDoc scorable = new ScoreAndDoc(); private ScoreAndDoc scorable = new ScoreAndDoc();
private final double[] maxScoreSums; private final double[] maxScoreSums;
MaxScoreBulkScorer(List<Scorer> scorers) throws IOException { MaxScoreBulkScorer(int maxDoc, List<Scorer> scorers) throws IOException {
this.maxDoc = maxDoc;
allScorers = new DisiWrapper[scorers.size()]; allScorers = new DisiWrapper[scorers.size()];
int i = 0; int i = 0;
long cost = 0; long cost = 0;
@ -181,6 +183,10 @@ final class MaxScoreBulkScorer extends BulkScorer {
/** Return the next candidate on or after {@code rangeEnd}. */ /** Return the next candidate on or after {@code rangeEnd}. */
private int nextCandidate(int rangeEnd) { private int nextCandidate(int rangeEnd) {
if (rangeEnd >= maxDoc) {
return DocIdSetIterator.NO_MORE_DOCS;
}
int next = DocIdSetIterator.NO_MORE_DOCS; int next = DocIdSetIterator.NO_MORE_DOCS;
for (DisiWrapper scorer : allScorers) { for (DisiWrapper scorer : allScorers) {
if (scorer.doc < rangeEnd) { if (scorer.doc < rangeEnd) {