Speed up sorting on unique string fields. (#11903)

Since the number of hits retrieved in nightly benchmarks was increased from 10
to 100, the performance of sorting documents by title has dropped back to the
level it had before dynamic pruning was introduced. This is not too surprising:
`title` is a unique field, so the optimization only kicks in once the current
100th hit has an ordinal below 128, which only happens after most hits have
been collected.

This change raises the threshold (`MAX_TERMS`) from 128 to 1024, so that the
optimization kicks in as soon as the current 100th hit has an ordinal below
1024, which happens noticeably sooner.
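
To make the effect of the threshold concrete, here is a minimal standalone
simulation. It is a sketch under stated assumptions, not the code in
`TermOrdValComparator`: it assumes an ascending sort, a unique field whose
ordinals are spread uniformly at random over doc IDs, and a simplified rule
where pruning starts once the bottom (k-th) hit has an ordinal below
`maxTerms`. The class `PruningThresholdSketch` and the method
`docsBeforePruning` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical simulation for illustration only; this is not Lucene code.
class PruningThresholdSketch {

  /**
   * Returns how many documents are collected before the current k-th best
   * (bottom) hit has an ordinal below maxTerms, or numDocs if that never
   * happens. Assumes an ascending sort on a unique field.
   */
  static int docsBeforePruning(int numDocs, int k, int maxTerms, long seed) {
    // Unique field: every document has its own ordinal, in random doc order.
    List<Integer> ords = new ArrayList<>(numDocs);
    for (int i = 0; i < numDocs; i++) {
      ords.add(i);
    }
    Collections.shuffle(ords, new Random(seed));

    // Max-heap of the k smallest ordinals seen so far; its head is the bottom hit.
    PriorityQueue<Integer> topK = new PriorityQueue<>(Collections.reverseOrder());
    for (int doc = 0; doc < numDocs; doc++) {
      topK.add(ords.get(doc));
      if (topK.size() > k) {
        topK.poll(); // evict the largest ordinal, keeping the k best
      }
      if (topK.size() == k && topK.peek() < maxTerms) {
        return doc + 1; // simplified rule: pruning could start from here
      }
    }
    return numDocs;
  }

  public static void main(String[] args) {
    int numDocs = 1_000_000;
    int k = 100;
    long seed = 42L;
    System.out.println("threshold 128:  " + docsBeforePruning(numDocs, k, 128, seed));
    System.out.println("threshold 1024: " + docsBeforePruning(numDocs, k, 1024, seed));
  }
}
```

Under these assumptions, the 100th hit drops below ordinal 128 only after
100 of the 128 lowest-ordinal documents have been seen, which on average takes
roughly three quarters of the index. With a threshold of 1024, only 100 of the
1024 lowest-ordinal documents are needed, which on average takes roughly a
tenth of the index.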
Adrien Grand 2023-11-02 14:16:11 +01:00 committed by GitHub
parent 4b3f7662ce
commit 5b87a31556
2 changed files with 3 additions and 1 deletions

@@ -253,6 +253,8 @@ Optimizations
 * GITHUB#12719: Top-level conjunctions that are not sorted by score now have a
 specialized bulk scorer. (Adrien Grand)
+* GITHUB#11903: Faster sort on high-cardinality string fields. (Adrien Grand)
 Changes in runtime behavior
 ---------------------

@@ -475,7 +475,7 @@ public class TermOrdValComparator extends FieldComparator<BytesRef> {
   private class CompetitiveIterator extends DocIdSetIterator {
-    private static final int MAX_TERMS = 128;
+    private static final int MAX_TERMS = 1024;
     private final LeafReaderContext context;
     private final int maxDoc;