LUCENE-10574: Keep allowing unbalanced merges if they would reclaim lots of deletes. (#905)

`TestTieredMergePolicy` caught this special case: if a segment has lots of
deletes, we should still allow unbalanced merges.
Adrien Grand 2022-05-20 10:06:38 +02:00 committed by GitHub
parent 8e777a1320
commit 5e9dfbed27
1 changed file with 6 additions and 1 deletion

@@ -536,11 +536,16 @@ public class TieredMergePolicy extends MergePolicy {
         SegmentSizeAndDocs maxCandidateSegmentSize = segInfosSizes.get(candidate.get(0));
         if (hitTooLarge == false
             && mergeType == MERGE_TYPE.NATURAL
-            && bytesThisMerge < maxCandidateSegmentSize.sizeInBytes * 1.5) {
+            && bytesThisMerge < maxCandidateSegmentSize.sizeInBytes * 1.5
+            && maxCandidateSegmentSize.delCount
+                < maxCandidateSegmentSize.maxDoc * deletesPctAllowed / 100) {
           // Ignore any merge where the resulting segment is not at least 50% larger than the
           // biggest input segment.
           // Otherwise we could run into pathological O(N^2) merging where merges keep rewriting
           // again and again the biggest input segment into a segment that is barely bigger.
+          // The only exception we make is when the merge would reclaim lots of deletes in the
+          // biggest segment. This is important for cases when lots of documents get deleted at once
+          // without introducing new segments of a similar size for instance.
           continue;
         }
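
The condition introduced by this change can be read in isolation. The following is a minimal, standalone sketch of it, not the actual `TieredMergePolicy` code: the class `UnbalancedMergeCheck`, the method `shouldSkip`, and the sample values are hypothetical names and numbers chosen for illustration, and the `hitTooLarge` and `mergeType` checks from the real condition are left out.

```java
// Hypothetical standalone sketch of the skip condition in the diff above.
// It is not part of Lucene; the names and sample numbers are assumptions.
public final class UnbalancedMergeCheck {

  private UnbalancedMergeCheck() {}

  /**
   * Returns true when a candidate merge should be skipped: the merged segment
   * would be less than 50% larger than the biggest input segment, AND that
   * segment holds fewer deletes than the allowed percentage of its documents.
   */
  public static boolean shouldSkip(
      long bytesThisMerge,
      long biggestSizeInBytes,
      int biggestDelCount,
      int biggestMaxDoc,
      double deletesPctAllowed) {
    // Same comparison as the pre-existing check: merged size under 1.5x the biggest input.
    boolean barelyBigger = bytesThisMerge < biggestSizeInBytes * 1.5;
    // New comparison from this commit: the biggest segment is below the delete budget.
    boolean fewDeletes = biggestDelCount < biggestMaxDoc * deletesPctAllowed / 100;
    return barelyBigger && fewDeletes;
  }

  public static void main(String[] args) {
    // Unbalanced merge, few deletes (50 of 1000 docs, 20% allowed): skipped.
    System.out.println(shouldSkip(1200L, 1000L, 50, 1000, 20.0));
    // Same sizes, but 400 of 1000 docs deleted: allowed, since it reclaims deletes.
    System.out.println(shouldSkip(1200L, 1000L, 400, 1000, 20.0));
    // Merged segment is at least 50% larger than the biggest input: allowed.
    System.out.println(shouldSkip(1600L, 1000L, 50, 1000, 20.0));
  }
}
```

In this sketch the second call illustrates the exception described in the commit message: the merge is still unbalanced, but the biggest segment exceeds its delete budget, so the merge is no longer filtered out.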