Disable suffix sharing for block tree index (#12722)

gf2121 2023-10-26 12:34:48 +08:00 committed by GitHub
parent 12fc7bf49f
commit c701a5d9be
2 changed files with 9 additions and 1 deletions

@@ -231,6 +231,9 @@ Optimizations
* GITHUB#12712: Speed up sorting postings file with an offline radix sorter in BPIndexReader. (Guo Feng)
+ * GITHUB#12702: Disable suffix sharing for block tree index, making writing the terms dictionary index faster
+   and less RAM hungry, while making the index a bit larger (~1.X% for the terms index file on wikipedia). (Guo Feng, Mike McCandless)
Changes in runtime behavior
---------------------

@@ -521,7 +521,12 @@ public final class Lucene90BlockTreeTermsWriter extends FieldsConsumer {
final ByteSequenceOutputs outputs = ByteSequenceOutputs.getSingleton();
final FSTCompiler<BytesRef> fstCompiler =
-     new FSTCompiler.Builder<>(FST.INPUT_TYPE.BYTE1, outputs).bytesPageBits(pageBits).build();
+     new FSTCompiler.Builder<>(FST.INPUT_TYPE.BYTE1, outputs)
+         // Disable suffixes sharing for block tree index because suffixes are mostly dropped
+         // from the FST index and left in the term blocks.
+         .suffixRAMLimitMB(0d)
+         .bytesPageBits(pageBits)
+         .build();
// if (DEBUG) {
// System.out.println(" compile index for prefix=" + prefix);
// }
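
The trade-off described in the CHANGES entry can be illustrated with a toy example. The sketch below is not Lucene's FST code: `SuffixSharingSketch` and all of its methods are hypothetical names invented for illustration, and a plain character trie stands in for the FST. It counts nodes twice, once keeping every trie node (no suffix sharing) and once deduplicating identical subtrees through a hash table (suffix sharing). The hash table is the extra memory and lookup work that sharing costs at build time. The payoff depends on how many suffixes repeat, which is why sharing helps little when the index holds mostly prefixes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: a character trie, not Lucene's FST implementation.
final class SuffixSharingSketch {

  /** Trie node; children are sorted so equal subtrees produce equal signatures. */
  static final class Node {
    final TreeMap<Character, Node> children = new TreeMap<>();
    boolean isFinal;
  }

  /** Builds a plain trie with one node per distinct prefix. */
  static Node buildTrie(List<String> terms) {
    Node root = new Node();
    for (String term : terms) {
      Node node = root;
      for (char c : term.toCharArray()) {
        node = node.children.computeIfAbsent(c, k -> new Node());
      }
      node.isFinal = true;
    }
    return root;
  }

  /** Node count without suffix sharing: every trie node is kept. */
  static int countNodes(Node node) {
    int count = 1;
    for (Node child : node.children.values()) {
      count += countNodes(child);
    }
    return count;
  }

  /** Node count with suffix sharing: identical subtrees collapse into one entry. */
  static int countSharedNodes(Node root) {
    Map<String, Integer> registry = new HashMap<>();
    canonicalId(root, registry);
    return registry.size();
  }

  /** Returns the id of the canonical node equal to this subtree, registering it if new. */
  private static int canonicalId(Node node, Map<String, Integer> registry) {
    StringBuilder signature = new StringBuilder(node.isFinal ? "F" : "N");
    for (Map.Entry<Character, Node> e : node.children.entrySet()) {
      int childId = canonicalId(e.getValue(), registry);
      signature.append(e.getKey()).append(':').append(childId).append(',');
    }
    String key = signature.toString();
    Integer existing = registry.get(key);
    if (existing != null) {
      return existing; // an identical subtree was already seen, so share it
    }
    int id = registry.size();
    registry.put(key, id);
    return id;
  }

  public static void main(String[] args) {
    Node root = buildTrie(List.of("tap", "taps", "top", "tops"));
    // prints "without sharing: 8 nodes, with sharing: 5 nodes"
    System.out.println(
        "without sharing: " + countNodes(root)
            + " nodes, with sharing: " + countSharedNodes(root) + " nodes");
  }
}
```

In this toy input the subtrees below "ta" and "to" are identical, so sharing saves three nodes. With inputs that rarely share suffixes, the registry still has to be filled and probed for every node while saving almost nothing, which is consistent with the reasoning given in the code comment of this commit.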