diff --git a/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90TermVectorsFormat.java b/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90TermVectorsFormat.java
index 0142f5461e8..e19168ff95d 100644
--- a/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90TermVectorsFormat.java
+++ b/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90TermVectorsFormat.java
@@ -56,15 +56,20 @@ import org.apache.lucene.util.packed.PackedInts;
*
VectorMeta (.tvm) --> <Header>, PackedIntsVersion, ChunkSize,
* ChunkIndexMetadata, ChunkCount, DirtyChunkCount, DirtyDocsCount, Footer
* Header --> {@link CodecUtil#writeIndexHeader IndexHeader}
- * PackedIntsVersion --> {@link PackedInts#VERSION_CURRENT} as a {@link
- * DataOutput#writeVInt VInt}
- * ChunkSize is the number of bytes of terms to accumulate before flushing, as a {@link
- * DataOutput#writeVInt VInt}
- * ChunkCount is not known in advance and is the number of chunks necessary to store all
- * document of the segment
- * DirtyChunkCount --> the number of prematurely flushed chunks in the .tvd file
+ * PackedIntsVersion, ChunkSize --> {@link DataOutput#writeVInt VInt}
+ * ChunkCount, DirtyChunkCount, DirtyDocsCount --> {@link DataOutput#writeVLong
+ * VLong}
+ * ChunkIndexMetadata --> {@link FieldsIndexWriter}
* Footer --> {@link CodecUtil#writeFooter CodecFooter}
*
+ * Notes:
+ *
+ * - PackedIntsVersion is {@link PackedInts#VERSION_CURRENT}.
+ *
- ChunkSize is the number of bytes of terms to accumulate before flushing.
+ *
- ChunkCount is not known in advance and is the number of chunks necessary to store all
+ * documents of the segment.
+ *
- DirtyChunkCount is the number of prematurely flushed chunks in the .tvd file.
+ *
*
* A vector data file (extension .tvd). This file stores terms, frequencies,
* positions, offsets and payloads for every document. Upon writing a new segment, it
@@ -80,76 +85,78 @@ import org.apache.lucene.util.packed.PackedInts;
* FieldNumOffs >, < Flags >, < NumTerms >, < TermLengths >, <
* TermFreqs >, < Positions >, < StartOffsets >, < Lengths >, <
* PayloadLengths >, < TermAndPayloads >
- *
DocBase is the ID of the first doc of the chunk as a {@link DataOutput#writeVInt
- * VInt}
- * ChunkDocs is the number of documents in the chunk
* NumFields --> DocNumFieldsChunkDocs
- * DocNumFields is the number of fields for each doc, written as a {@link
- * DataOutput#writeVInt VInt} if ChunkDocs==1 and as a {@link PackedInts} array
- * otherwise
- * FieldNums --> FieldNumDeltaTotalDistincFields, a delta-encoded list of
- * the sorted unique field numbers present in the chunk
- * FieldNumOffs --> FieldNumOffTotalFields, as a {@link PackedInts} array
- * FieldNumOff is the offset of the field number in FieldNums
- * TotalFields is the total number of fields (sum of the values of NumFields)
+ * FieldNums --> FieldNumDeltaTotalDistinctFields
* Flags --> Bit < FieldFlags >
- * Bit is a single bit which when true means that fields have the same options for every
- * document in the chunk
* FieldFlags --> if Bit==1: FlagTotalDistinctFields else
* FlagTotalFields
+ * NumTerms --> FieldNumTermsTotalFields
+ * TermLengths --> PrefixLengthTotalTerms
+ * SuffixLengthTotalTerms
+ * TermFreqs --> TermFreqMinus1TotalTerms
+ * Positions --> PositionDeltaTotalPositions
+ * StartOffsets --> (AvgCharsPerTermTotalDistinctFields)
+ * StartOffsetDeltaTotalOffsets
+ * Lengths --> LengthMinusTermLengthTotalOffsets
+ * PayloadLengths --> PayloadLengthTotalPayloads
+ * TermAndPayloads --> LZ4-compressed representation of < FieldTermsAndPayLoads
+ * >TotalFields
+ * FieldTermsAndPayLoads --> Terms (Payloads)
+ * DocBase, ChunkDocs, DocNumFields (with ChunkDocs==1) --> {@link
+ * DataOutput#writeVInt VInt}
+ * AvgCharsPerTerm --> {@link DataOutput#writeInt Int}
+ * DocNumFields (with ChunkDocs>1), FieldNumOffs --> {@link PackedInts} array
+ * FieldNumTerms, PrefixLength, SuffixLength, TermFreqMinus1, PositionDelta,
+ * StartOffsetDelta, LengthMinusTermLength, PayloadLength --> {@link
+ * BlockPackedWriter blocks of 64 packed ints}
+ * Footer --> {@link CodecUtil#writeFooter CodecFooter}
+ *
+ * Notes:
+ *
+ * - DocBase is the ID of the first doc of the chunk.
+ *
- ChunkDocs is the number of documents in the chunk.
+ *
- DocNumFields is the number of fields for each doc.
+ *
- FieldNums is a delta-encoded list of the sorted unique field numbers present in the
+ * chunk.
+ *
- FieldNumOffs is the array of FieldNumOff; array size is the total number of fields in
+ * the chunk.
+ *
- FieldNumOff is the offset of the field number in FieldNums.
+ *
- TotalFields is the total number of fields (sum of the values of NumFields).
+ *
- Bit in Flags is a single bit which when true means that fields have the same options
+ * for every document in the chunk.
*
- Flag: a 3-bits int where:
*
* - the first bit means that the field has positions
*
- the second bit means that the field has offsets
*
- the third bit means that the field has payloads
*
- * - NumTerms --> FieldNumTermsTotalFields
- *
- FieldNumTerms: the number of terms for each field, using {@link BlockPackedWriter
- * blocks of 64 packed ints}
- *
- TermLengths --> PrefixLengthTotalTerms
- * SuffixLengthTotalTerms
- *
- TotalTerms: total number of terms (sum of NumTerms)
- *
- PrefixLength: 0 for the first term of a field, the common prefix with the previous
- * term otherwise using {@link BlockPackedWriter blocks of 64 packed ints}
- *
- SuffixLength: length of the term minus PrefixLength for every term using {@link
- * BlockPackedWriter blocks of 64 packed ints}
- *
- TermFreqs --> TermFreqMinus1TotalTerms
- *
- TermFreqMinus1: (frequency - 1) for each term using {@link BlockPackedWriter blocks
- * of 64 packed ints}
- *
- Positions --> PositionDeltaTotalPositions
- *
- TotalPositions is the sum of frequencies of terms of all fields that have positions
- *
- PositionDelta: the absolute position for the first position of a term, and the
- * difference with the previous positions for following positions using {@link
- * BlockPackedWriter blocks of 64 packed ints}
- *
- StartOffsets --> (AvgCharsPerTermTotalDistinctFields)
- * StartOffsetDeltaTotalOffsets
- *
- TotalOffsets is the sum of frequencies of terms of all fields that have offsets
- *
- AvgCharsPerTerm: average number of chars per term, encoded as a float on 4 bytes.
- * They are not present if no field has both positions and offsets enabled.
- *
- StartOffsetDelta: (startOffset - previousStartOffset - AvgCharsPerTerm *
+ *
- FieldNumTerms is the number of terms for each field.
+ *
- TotalTerms is the total number of terms (sum of NumTerms).
+ *
- PrefixLength is 0 for the first term of a field, the length of the common prefix with
+ * the previous term otherwise.
+ *
- SuffixLength is the length of the term minus PrefixLength for every term.
+ *
- TermFreqMinus1 is (frequency - 1) for each term.
+ *
- TotalPositions is the sum of frequencies of terms of all fields that have positions.
+ *
- PositionDelta is the absolute position for the first position of a term, and the
+ * difference from the previous position for the following positions.
+ *
- TotalOffsets is the sum of frequencies of terms of all fields that have offsets.
+ *
- AvgCharsPerTerm is the average number of chars per term, encoded as a float on 4
+ * bytes. They are not present if no field has both positions and offsets enabled.
+ *
- StartOffsetDelta is (startOffset - previousStartOffset - AvgCharsPerTerm *
* PositionDelta). previousStartOffset is 0 for the first offset and AvgCharsPerTerm is
- * 0 if the field has no positions using {@link BlockPackedWriter blocks of 64 packed
- * ints}
- *
- Lengths --> LengthMinusTermLengthTotalOffsets
- *
- LengthMinusTermLength: (endOffset - startOffset - termLength) using {@link
- * BlockPackedWriter blocks of 64 packed ints}
- *
- PayloadLengths --> PayloadLengthTotalPayloads
- *
- TotalPayloads is the sum of frequencies of terms of all fields that have payloads
- *
- PayloadLength is the payload length encoded using {@link BlockPackedWriter blocks of
- * 64 packed ints}
- *
- TermAndPayloads --> LZ4-compressed representation of < FieldTermsAndPayLoads
- * >TotalFields
- *
- FieldTermsAndPayLoads --> Terms (Payloads)
- *
- Terms: term bytes
- *
- Payloads: payload bytes (if the field has payloads)
- *
- Footer --> {@link CodecUtil#writeFooter CodecFooter}
+ * 0 if the field has no positions.
+ *
- LengthMinusTermLength is (endOffset - startOffset - termLength).
+ *
- TotalPayloads is the sum of frequencies of terms of all fields that have payloads.
+ *
- PayloadLength is the payload length.
+ *
- Terms are the term bytes.
+ *
- Payloads are the payload bytes (if the field has payloads).
*
*
* An index file (extension .tvx).
*
* - VectorIndex (.tvx) --> <Header>, <ChunkIndex>, Footer
*
- Header --> {@link CodecUtil#writeIndexHeader IndexHeader}
- *
- ChunkIndex: See {@link FieldsIndexWriter}
+ *
- ChunkIndex --> {@link FieldsIndexWriter}
*
- Footer --> {@link CodecUtil#writeFooter CodecFooter}
*
*
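
The PrefixLength, SuffixLength and PositionDelta notes in the diff above describe simple delta schemes, which may be easier to review with a concrete example. The following is a minimal sketch, not Lucene's writer code: the class `TermVectorDeltaSketch` and its methods are hypothetical names introduced here, terms are assumed to be compared as UTF-8 bytes, and the packing of the resulting values into blocks of 64 packed ints is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; hypothetical helper, not part of Lucene.
class TermVectorDeltaSketch {

  /** Returns {prefixLengths, suffixLengths} for the sorted terms of one field. */
  static int[][] prefixSuffixLengths(String[] sortedTerms) {
    int[] prefix = new int[sortedTerms.length];
    int[] suffix = new int[sortedTerms.length];
    byte[] previous = new byte[0];
    for (int i = 0; i < sortedTerms.length; i++) {
      byte[] term = sortedTerms[i].getBytes(StandardCharsets.UTF_8);
      int common = 0;
      // PrefixLength is 0 for the first term of a field.
      if (i > 0) {
        int limit = Math.min(previous.length, term.length);
        while (common < limit && previous[common] == term[common]) {
          common++;
        }
      }
      prefix[i] = common;
      // SuffixLength is the term length minus PrefixLength.
      suffix[i] = term.length - common;
      previous = term;
    }
    return new int[][] {prefix, suffix};
  }

  /** First value is the absolute position; later values are differences from the previous one. */
  static int[] positionDeltas(int[] positions) {
    int[] deltas = new int[positions.length];
    int previous = 0;
    for (int i = 0; i < positions.length; i++) {
      deltas[i] = positions[i] - previous;
      previous = positions[i];
    }
    return deltas;
  }

  public static void main(String[] args) {
    int[][] lengths = prefixSuffixLengths(new String[] {"apple", "apply", "banana"});
    System.out.println("PrefixLength: " + Arrays.toString(lengths[0]));
    System.out.println("SuffixLength: " + Arrays.toString(lengths[1]));
    System.out.println("PositionDelta: " + Arrays.toString(positionDeltas(new int[] {3, 7, 12})));
  }
}
```

For the terms `apple`, `apply`, `banana`, the second term shares the four bytes `appl` with the first, so only one suffix byte would need to be stored for it, while `banana` shares nothing with `apply` and is stored in full.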