LUCENE-1848: remove old version references where it makes sense

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@807653 13f79535-47bb-0310-9956-ffa450edef68
Grant Ingersoll 2009-08-25 14:36:47 +00:00
parent 3519f543e7
commit 7dd9b440aa
3 changed files with 398 additions and 591 deletions

@ -368,7 +368,7 @@ document.write("Last Published: " + document.lastModified);
<div class="section"> <div class="section">
<p> <p>
This document defines the index file formats used This document defines the index file formats used
in Lucene version 2.1. If you are using a different in Lucene version 2.9. If you are using a different
version of Lucene, please consult the copy of version of Lucene, please consult the copy of
<span class="codefrag">docs/fileformats.html</span> <span class="codefrag">docs/fileformats.html</span>
that was distributed that was distributed
@ -382,7 +382,7 @@ document.write("Last Published: " + document.lastModified);
languages</a>. If these versions are to remain compatible with Apache languages</a>. If these versions are to remain compatible with Apache
Lucene, then a language-independent definition of the Lucene index Lucene, then a language-independent definition of the Lucene index
format is required. This document thus attempts to provide a format is required. This document thus attempts to provide a
complete and independent definition of the Apache Lucene 2.1 file complete and independent definition of the Apache Lucene 2.9 file
formats. formats.
</p> </p>
<p> <p>
@ -786,7 +786,7 @@ document.write("Last Published: " + document.lastModified);
<tr> <tr>
<td><a href="#Normalization Factors">Norms</a></td> <td><a href="#Normalization Factors">Norms</a></td>
<td>.nrm (pre 2.1: .f[0-9]*)</td> <td>.nrm</td>
<td>Encodes length and boost factors for docs and fields</td> <td>Encodes length and boost factors for docs and fields</td>
</tr> </tr>
@ -1492,37 +1492,7 @@ document.write("Last Published: " + document.lastModified);
</p> </p>
<p> <p>
<b>Pre-2.1:</b> <b>2.9</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;
<sup>SegCount</sup>
</p>
<p>
<b>2.1 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile&gt;<sup>SegCount</sup>
</p>
<p>
<b>2.3:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile&gt;<sup>SegCount</sup>
</p>
<p>
<b>2.4 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile, DeletionCount, HasProx&gt;<sup>SegCount</sup>, Checksum
</p>
<p>
<b>2.9 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField, Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>, NormGen<sup>NumField</sup>,
IsCompoundFile, DeletionCount, HasProx, Diagnostics&gt;<sup>SegCount</sup>, CommitUserData, Checksum IsCompoundFile, DeletionCount, HasProx, Diagnostics&gt;<sup>SegCount</sup>, CommitUserData, Checksum
@ -1548,7 +1518,7 @@ document.write("Last Published: " + document.lastModified);
CommitUserData --&gt; Map&lt;String,String&gt; CommitUserData --&gt; Map&lt;String,String&gt;
</p> </p>
<p> <p>
Format is -1 as of Lucene 1.4, -3 (SegmentInfos.FORMAT_SINGLE_NORM_FILE) as of Lucene 2.1 and 2.2, -4 (SegmentInfos.FORMAT_SHARED_DOC_STORE) as of Lucene 2.3, -7 (SegmentInfos.FORMAT_HAS_PROX) as of Lucene 2.4, and -9 (SegmentInfos.FORMAT_DIAGNOSTICS) as of Lucene 2.9. Format is -9 (SegmentInfos.FORMAT_DIAGNOSTICS).
</p> </p>
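As a rough illustration of the Segments header named in this hunk, the sketch below reads the first four fields (Format, Version, NameCounter, SegCount) and checks that Format is -9. The field widths (Int32 for Format, NameCounter and SegCount; Int64 for Version) and the big-endian byte order are assumptions that do not appear in this diff, and `read_segments_header` is a hypothetical helper, not a Lucene API.

```python
import struct

# Value quoted in the text: SegmentInfos.FORMAT_DIAGNOSTICS
FORMAT_DIAGNOSTICS = -9

# Assumed layout (not stated in this diff): big-endian
# Int32 Format, Int64 Version, Int32 NameCounter, Int32 SegCount.
HEADER = struct.Struct(">iqii")


def read_segments_header(data: bytes) -> dict:
    """Parse the leading fields of a segments file (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("buffer too short for a segments header")
    fmt, version, name_counter, seg_count = HEADER.unpack_from(data, 0)
    if fmt != FORMAT_DIAGNOSTICS:
        # Older formats (for example -1, -3, -4, -7) are out of scope here.
        raise ValueError(f"unexpected Format {fmt}, expected {FORMAT_DIAGNOSTICS}")
    return {
        "format": fmt,
        "version": version,
        "name_counter": name_counter,
        "seg_count": seg_count,
    }
```

The per-segment entries, CommitUserData and Checksum that follow the header are deliberately not parsed; this only shows how the Format check might gate the rest of a reader.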
<p> <p>
Version counts how often the index has been Version counts how often the index has been
@ -1648,7 +1618,7 @@ document.write("Last Published: " + document.lastModified);
Lucene version, OS, Java version, why the segment Lucene version, OS, Java version, why the segment
was created (merge, flush, addIndexes), etc. was created (merge, flush, addIndexes), etc.
</p> </p>
<a name="N105EB"></a><a name="Lock File"></a> <a name="N105BE"></a><a name="Lock File"></a>
<h3 class="boxed">Lock File</h3> <h3 class="boxed">Lock File</h3>
<p> <p>
The write lock, which is stored in the index The write lock, which is stored in the index
@ -1662,20 +1632,14 @@ document.write("Last Published: " + document.lastModified);
documents). This lock file ensures that only one documents). This lock file ensures that only one
writer is modifying the index at a time. writer is modifying the index at a time.
</p> </p>
<p> <a name="N105C7"></a><a name="Deletable File"></a>
Note that prior to version 2.1, Lucene also used a
commit lock. This was removed in 2.1.
</p>
<a name="N105F7"></a><a name="Deletable File"></a>
<h3 class="boxed">Deletable File</h3> <h3 class="boxed">Deletable File</h3>
<p> <p>
Prior to Lucene 2.1 there was a file "deletable" A writer dynamically computes
that contained details about files that need to be
deleted. As of 2.1, a writer dynamically computes
the files that are deletable, instead, so no file the files that are deletable, instead, so no file
is written. is written.
</p> </p>
<a name="N10600"></a><a name="Compound Files"></a> <a name="N105D0"></a><a name="Compound Files"></a>
<h3 class="boxed">Compound Files</h3> <h3 class="boxed">Compound Files</h3>
<p>Starting with Lucene 1.4 the compound file format became default. This <p>Starting with Lucene 1.4 the compound file format became default. This
is simply a container for all files described in the next section is simply a container for all files described in the next section
@ -1702,14 +1666,14 @@ document.write("Last Published: " + document.lastModified);
</div> </div>
<a name="N10628"></a><a name="Per-Segment Files"></a> <a name="N105F8"></a><a name="Per-Segment Files"></a>
<h2 class="boxed">Per-Segment Files</h2> <h2 class="boxed">Per-Segment Files</h2>
<div class="section"> <div class="section">
<p> <p>
The remaining files are all per-segment, and are The remaining files are all per-segment, and are
thus defined by suffix. thus defined by suffix.
</p> </p>
<a name="N10630"></a><a name="Fields"></a> <a name="N10600"></a><a name="Fields"></a>
<h3 class="boxed">Fields</h3> <h3 class="boxed">Fields</h3>
<p> <p>
@ -1755,12 +1719,6 @@ document.write("Last Published: " + document.lastModified);
without term vectors. without term vectors.
</li> </li>
<p>
<b>Lucene &gt;= 1.9:</b>
</p>
<li>If the third lowest-order bit is set (0x04), term positions are stored with the term vectors.</li> <li>If the third lowest-order bit is set (0x04), term positions are stored with the term vectors.</li>
<li>If the fourth lowest-order bit is set (0x08), term offsets are stored with the term vectors.</li> <li>If the fourth lowest-order bit is set (0x08), term offsets are stored with the term vectors.</li>
@ -1873,31 +1831,6 @@ document.write("Last Published: " + document.lastModified);
VInt VInt
</p> </p>
<p>
<b>Lucene &lt;= 1.4:</b>
</p>
<p>Bits --&gt;
Byte
</p>
<p>Value --&gt;
String
</p>
<p>Only the low-order bit of Bits is used. It is one for
tokenized fields, and zero for non-tokenized fields.
</p>
<p>
<b>Lucene &gt;= 1.9:</b>
</p>
<p>Bits --&gt; <p>Bits --&gt;
Byte Byte
</p> </p>
@ -1933,7 +1866,7 @@ document.write("Last Published: " + document.lastModified);
</li> </li>
</ol> </ol>
<a name="N106F2"></a><a name="Term Dictionary"></a> <a name="N106A7"></a><a name="Term Dictionary"></a>
<h3 class="boxed">Term Dictionary</h3> <h3 class="boxed">Term Dictionary</h3>
<p> <p>
The term dictionary is represented as two files: The term dictionary is represented as two files:
@ -2006,7 +1939,7 @@ document.write("Last Published: " + document.lastModified);
</p> </p>
<p>TIVersion names the version of the format <p>TIVersion names the version of the format
of this file and is -2 in Lucene 1.4. of this file and is equal to TermInfosWriter.FORMAT_CURRENT.
</p> </p>
<p>Term <p>Term
@ -2125,7 +2058,7 @@ document.write("Last Published: " + document.lastModified);
</li> </li>
</ol> </ol>
<a name="N10776"></a><a name="Frequencies"></a> <a name="N1072B"></a><a name="Frequencies"></a>
<h3 class="boxed">Frequencies</h3> <h3 class="boxed">Frequencies</h3>
<p> <p>
The .frq file contains the lists of documents The .frq file contains the lists of documents
@ -2241,7 +2174,7 @@ document.write("Last Published: " + document.lastModified);
<sup>nd</sup> <sup>nd</sup>
starts. starts.
</p> </p>
<p>Lucene 2.2 introduces the notion of skip levels. Each term can have multiple skip levels. <p>Each term can have multiple skip levels.
The amount of skip levels for a term is NumSkipLevels = Min(MaxSkipLevels, floor(log(DocFreq/log(SkipInterval)))). The amount of skip levels for a term is NumSkipLevels = Min(MaxSkipLevels, floor(log(DocFreq/log(SkipInterval)))).
The number of SkipData entries for a skip level is DocFreq/(SkipInterval^(Level + 1)), whereas the lowest skip The number of SkipData entries for a skip level is DocFreq/(SkipInterval^(Level + 1)), whereas the lowest skip
level is Level=0. <br> level is Level=0. <br>
@ -2253,7 +2186,7 @@ document.write("Last Published: " + document.lastModified);
entry in level-1. In the example has entry 15 on level 1 a pointer to entry 15 on level 0 and entry 31 on level 1 a pointer entry in level-1. In the example has entry 15 on level 1 a pointer to entry 15 on level 0 and entry 31 on level 1 a pointer
to entry 31 on level 0. to entry 31 on level 0.
</p> </p>
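The skip-level arithmetic in the paragraph above can be sketched as follows. This assumes the formula is meant to be read as floor(log(DocFreq) / log(SkipInterval)), that is, a logarithm to base SkipInterval, which differs from the parenthesization printed in the text; the function names and the sample numbers are hypothetical.

```python
def num_skip_levels(doc_freq: int, skip_interval: int, max_skip_levels: int) -> int:
    """Min(MaxSkipLevels, floor(log_SkipInterval(DocFreq))), computed with integers."""
    if skip_interval < 2:
        raise ValueError("skip_interval must be at least 2")
    levels = 0
    remaining = doc_freq
    # Each integer division by skip_interval corresponds to one more level,
    # which avoids floating-point rounding in the logarithm.
    while remaining >= skip_interval:
        remaining //= skip_interval
        levels += 1
    return min(max_skip_levels, levels)


def skip_entries(doc_freq: int, skip_interval: int, level: int) -> int:
    """Number of SkipData entries on a level: DocFreq / SkipInterval^(Level + 1)."""
    return doc_freq // skip_interval ** (level + 1)


if __name__ == "__main__":
    # Hypothetical term: 35 documents, SkipInterval 4, MaxSkipLevels 2.
    levels = num_skip_levels(35, 4, 2)
    for level in range(levels):
        print(level, skip_entries(35, 4, level))
```

With these sample values the term gets two levels, with 8 entries on level 0 and 2 entries on level 1.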
<a name="N107FE"></a><a name="Positions"></a> <a name="N107B3"></a><a name="Positions"></a>
<h3 class="boxed">Positions</h3> <h3 class="boxed">Positions</h3>
<p> <p>
The .prx file contains the lists of positions that The .prx file contains the lists of positions that
@ -2323,25 +2256,9 @@ document.write("Last Published: " + document.lastModified);
Payload. If PayloadLength is not stored, then this Payload has the same Payload. If PayloadLength is not stored, then this Payload has the same
length as the Payload at the previous position. length as the Payload at the previous position.
</p> </p>
<a name="N1083A"></a><a name="Normalization Factors"></a> <a name="N107EF"></a><a name="Normalization Factors"></a>
<h3 class="boxed">Normalization Factors</h3> <h3 class="boxed">Normalization Factors</h3>
<p> <p>There's a single .nrm file containing all norms:
<b>Pre-2.1:</b>
There's a norm file for each indexed field with a byte for
each document. The .f[0-9]* file contains,
for each document, a byte that encodes a value that is multiplied
into the score for hits on that field:
</p>
<p>Norms
(.f[0-9]*) --&gt; &lt;Byte&gt;
<sup>SegSize</sup>
</p>
<p>
<b>2.1 and above:</b>
There's a single .nrm file containing all norms:
</p> </p>
<p>AllNorms <p>AllNorms
(.nrm) --&gt; NormsHeader,&lt;Norms&gt; (.nrm) --&gt; NormsHeader,&lt;Norms&gt;
@ -2417,17 +2334,9 @@ document.write("Last Published: " + document.lastModified);
When field <em>N</em> is modified, a separate norm file <em>.sN</em> When field <em>N</em> is modified, a separate norm file <em>.sN</em>
is created, to maintain the norm values for that field. is created, to maintain the norm values for that field.
</p> </p>
<p> <p>Separate norm files are created (when adequate) for both compound and non compound segments.
<b>Pre-2.1:</b>
Separate norm files are created only for compound segments.
</p> </p>
<p> <a name="N10840"></a><a name="Term Vectors"></a>
<b>2.1 and above:</b>
Separate norm files are created (when adequate) for both compound and non compound segments.
</p>
<a name="N108A3"></a><a name="Term Vectors"></a>
<h3 class="boxed">Term Vectors</h3> <h3 class="boxed">Term Vectors</h3>
<p> <p>
Term Vector support is an optional on a field by Term Vector support is an optional on a field by
@ -2450,7 +2359,7 @@ document.write("Last Published: " + document.lastModified);
</p> </p>
<p>TVXVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVXVersion --&gt; Int (TermVectorsReader.CURRENT)</p>
<p>DocumentPosition --&gt; UInt64 (offset in <p>DocumentPosition --&gt; UInt64 (offset in
the .tvd file)</p> the .tvd file)</p>
@ -2475,7 +2384,7 @@ document.write("Last Published: " + document.lastModified);
</p> </p>
<p>TVDVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVDVersion --&gt; Int (TermVectorsReader.FORMAT_CURRENT)</p>
<p>NumFields --&gt; VInt</p> <p>NumFields --&gt; VInt</p>
@ -2511,7 +2420,7 @@ document.write("Last Published: " + document.lastModified);
</p> </p>
<p>TVFVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVFVersion --&gt; Int (TermVectorsReader.FORMAT_CURRENT)</p>
<p>NumTerms --&gt; VInt</p> <p>NumTerms --&gt; VInt</p>
@ -2563,7 +2472,7 @@ document.write("Last Published: " + document.lastModified);
</li> </li>
</ol> </ol>
<a name="N1093F"></a><a name="Deleted Documents"></a> <a name="N108DC"></a><a name="Deleted Documents"></a>
<h3 class="boxed">Deleted Documents</h3> <h3 class="boxed">Deleted Documents</h3>
<p>The .del file is <p>The .del file is
optional, and only exists when a segment contains deletions. optional, and only exists when a segment contains deletions.
@ -2571,14 +2480,6 @@ document.write("Last Published: " + document.lastModified);
<p>Although per-segment, this file is maintained exterior to compound segment files. <p>Although per-segment, this file is maintained exterior to compound segment files.
</p> </p>
<p> <p>
<b>Pre-2.1:</b>
Deletions
(.del) --&gt; ByteCount,BitCount,Bits
</p>
<p>
<b>2.1 and above:</b>
Deletions Deletions
(.del) --&gt; [Format],ByteCount,BitCount, Bits | DGaps (depending on Format) (.del) --&gt; [Format],ByteCount,BitCount, Bits | DGaps (depending on Format)
</p> </p>
@ -2635,7 +2536,7 @@ document.write("Last Published: " + document.lastModified);
</div> </div>
<a name="N10982"></a><a name="Limitations"></a> <a name="N10916"></a><a name="Limitations"></a>
<h2 class="boxed">Limitations</h2> <h2 class="boxed">Limitations</h2>
<div class="section"> <div class="section">
<p> <p>

File diff suppressed because it is too large

@ -12,7 +12,7 @@
<p> <p>
This document defines the index file formats used This document defines the index file formats used
in Lucene version 2.1. If you are using a different in Lucene version 2.9. If you are using a different
version of Lucene, please consult the copy of version of Lucene, please consult the copy of
<code>docs/fileformats.html</code> <code>docs/fileformats.html</code>
that was distributed that was distributed
@ -27,7 +27,7 @@
languages</a>. If these versions are to remain compatible with Apache languages</a>. If these versions are to remain compatible with Apache
Lucene, then a language-independent definition of the Lucene index Lucene, then a language-independent definition of the Lucene index
format is required. This document thus attempts to provide a format is required. This document thus attempts to provide a
complete and independent definition of the Apache Lucene 2.1 file complete and independent definition of the Apache Lucene 2.9 file
formats. formats.
</p> </p>
@ -367,7 +367,7 @@
</tr> </tr>
<tr> <tr>
<td><a href="#Normalization Factors">Norms</a></td> <td><a href="#Normalization Factors">Norms</a></td>
<td>.nrm (pre 2.1: .f[0-9]*)</td> <td>.nrm</td>
<td>Encodes length and boost factors for docs and fields</td> <td>Encodes length and boost factors for docs and fields</td>
</tr> </tr>
<tr> <tr>
@ -903,32 +903,8 @@
-2), followed by the generation recorded as Int64, -2), followed by the generation recorded as Int64,
written twice. written twice.
</p> </p>
<p> <p>
<b>Pre-2.1:</b> <b>2.9</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;
<sup>SegCount</sup>
</p>
<p>
<b>2.1 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile&gt;<sup>SegCount</sup>
</p>
<p>
<b>2.3:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile&gt;<sup>SegCount</sup>
</p>
<p>
<b>2.4 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>,
IsCompoundFile, DeletionCount, HasProx&gt;<sup>SegCount</sup>, Checksum
</p>
<p>
<b>2.9 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField, Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField,
NormGen<sup>NumField</sup>, NormGen<sup>NumField</sup>,
IsCompoundFile, DeletionCount, HasProx, Diagnostics&gt;<sup>SegCount</sup>, CommitUserData, Checksum IsCompoundFile, DeletionCount, HasProx, Diagnostics&gt;<sup>SegCount</sup>, CommitUserData, Checksum
@ -961,7 +937,7 @@
</p> </p>
<p> <p>
Format is -1 as of Lucene 1.4, -3 (SegmentInfos.FORMAT_SINGLE_NORM_FILE) as of Lucene 2.1 and 2.2, -4 (SegmentInfos.FORMAT_SHARED_DOC_STORE) as of Lucene 2.3, -7 (SegmentInfos.FORMAT_HAS_PROX) as of Lucene 2.4, and -9 (SegmentInfos.FORMAT_DIAGNOSTICS) as of Lucene 2.9. Format is -9 (SegmentInfos.FORMAT_DIAGNOSTICS).
</p> </p>
<p> <p>
@ -1092,20 +1068,12 @@
documents). This lock file ensures that only one documents). This lock file ensures that only one
writer is modifying the index at a time. writer is modifying the index at a time.
</p> </p>
<p>
Note that prior to version 2.1, Lucene also used a
commit lock. This was removed in 2.1.
</p>
</section> </section>
<section id="Deletable File"><title>Deletable File</title> <section id="Deletable File"><title>Deletable File</title>
<p> <p>
Prior to Lucene 2.1 there was a file "deletable" A writer dynamically computes
that contained details about files that need to be
deleted. As of 2.1, a writer dynamically computes
the files that are deletable, instead, so no file the files that are deletable, instead, so no file
is written. is written.
</p> </p>
@ -1193,9 +1161,6 @@
bit is one for fields that have term vectors stored, and zero for fields bit is one for fields that have term vectors stored, and zero for fields
without term vectors. without term vectors.
</li> </li>
<p>
<b>Lucene &gt;= 1.9:</b>
</p>
<li>If the third lowest-order bit is set (0x04), term positions are stored with the term vectors.</li> <li>If the third lowest-order bit is set (0x04), term positions are stored with the term vectors.</li>
<li>If the fourth lowest-order bit is set (0x08), term offsets are stored with the term vectors.</li> <li>If the fourth lowest-order bit is set (0x08), term offsets are stored with the term vectors.</li>
<li>If the fifth lowest-order bit is set (0x10), norms are omitted for the indexed field.</li> <li>If the fifth lowest-order bit is set (0x10), norms are omitted for the indexed field.</li>
@ -1286,22 +1251,6 @@
<p>FieldNum --&gt; <p>FieldNum --&gt;
VInt VInt
</p> </p>
<p>
<b>Lucene &lt;= 1.4:</b>
</p>
<p>Bits --&gt;
Byte
</p>
<p>Value --&gt;
String
</p>
<p>Only the low-order bit of Bits is used. It is one for
tokenized fields, and zero for non-tokenized fields.
</p>
<p>
<b>Lucene &gt;= 1.9:</b>
</p>
<p>Bits --&gt; <p>Bits --&gt;
Byte Byte
</p> </p>
@ -1383,7 +1332,7 @@
UTF16 character code) by the term's text. UTF16 character code) by the term's text.
</p> </p>
<p>TIVersion names the version of the format <p>TIVersion names the version of the format
of this file and is -2 in Lucene 1.4. of this file and is equal to TermInfosWriter.FORMAT_CURRENT.
</p> </p>
<p>Term <p>Term
text prefixes are shared. The PrefixLength is the number of initial text prefixes are shared. The PrefixLength is the number of initial
@ -1592,7 +1541,7 @@
<sup>nd</sup> <sup>nd</sup>
starts. starts.
</p> </p>
<p>Lucene 2.2 introduces the notion of skip levels. Each term can have multiple skip levels. <p>Each term can have multiple skip levels.
The amount of skip levels for a term is NumSkipLevels = Min(MaxSkipLevels, floor(log(DocFreq/log(SkipInterval)))). The amount of skip levels for a term is NumSkipLevels = Min(MaxSkipLevels, floor(log(DocFreq/log(SkipInterval)))).
The number of SkipData entries for a skip level is DocFreq/(SkipInterval^(Level + 1)), whereas the lowest skip The number of SkipData entries for a skip level is DocFreq/(SkipInterval^(Level + 1)), whereas the lowest skip
level is Level=0. <br></br> level is Level=0. <br></br>
@ -1674,20 +1623,8 @@
</p> </p>
</section> </section>
<section id="Normalization Factors"><title>Normalization Factors</title> <section id="Normalization Factors"><title>Normalization Factors</title>
<p>
<b>Pre-2.1:</b> <p>There's a single .nrm file containing all norms:
There's a norm file for each indexed field with a byte for
each document. The .f[0-9]* file contains,
for each document, a byte that encodes a value that is multiplied
into the score for hits on that field:
</p>
<p>Norms
(.f[0-9]*) --&gt; &lt;Byte&gt;
<sup>SegSize</sup>
</p>
<p>
<b>2.1 and above:</b>
There's a single .nrm file containing all norms:
</p> </p>
<p>AllNorms <p>AllNorms
(.nrm) --&gt; NormsHeader,&lt;Norms&gt; (.nrm) --&gt; NormsHeader,&lt;Norms&gt;
@ -1745,13 +1682,7 @@
When field <em>N</em> is modified, a separate norm file <em>.sN</em> When field <em>N</em> is modified, a separate norm file <em>.sN</em>
is created, to maintain the norm values for that field. is created, to maintain the norm values for that field.
</p> </p>
<p> <p>Separate norm files are created (when adequate) for both compound and non compound segments.
<b>Pre-2.1:</b>
Separate norm files are created only for compound segments.
</p>
<p>
<b>2.1 and above:</b>
Separate norm files are created (when adequate) for both compound and non compound segments.
</p> </p>
</section> </section>
@ -1770,7 +1701,7 @@
<p>DocumentIndex (.tvx) --&gt; TVXVersion&lt;DocumentPosition,FieldPosition&gt; <p>DocumentIndex (.tvx) --&gt; TVXVersion&lt;DocumentPosition,FieldPosition&gt;
<sup>NumDocs</sup> <sup>NumDocs</sup>
</p> </p>
<p>TVXVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVXVersion --&gt; Int (TermVectorsReader.CURRENT)</p>
<p>DocumentPosition --&gt; UInt64 (offset in <p>DocumentPosition --&gt; UInt64 (offset in
the .tvd file)</p> the .tvd file)</p>
<p>FieldPosition --&gt; UInt64 (offset in the <p>FieldPosition --&gt; UInt64 (offset in the
@ -1785,7 +1716,7 @@
Document (.tvd) --&gt; TVDVersion&lt;NumFields, FieldNums, FieldPositions&gt; Document (.tvd) --&gt; TVDVersion&lt;NumFields, FieldNums, FieldPositions&gt;
<sup>NumDocs</sup> <sup>NumDocs</sup>
</p> </p>
<p>TVDVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVDVersion --&gt; Int (TermVectorsReader.FORMAT_CURRENT)</p>
<p>NumFields --&gt; VInt</p> <p>NumFields --&gt; VInt</p>
<p>FieldNums --&gt; &lt;FieldNumDelta&gt; <p>FieldNums --&gt; &lt;FieldNumDelta&gt;
<sup>NumFields</sup> <sup>NumFields</sup>
@ -1805,7 +1736,7 @@
<p>Field (.tvf) --&gt; TVFVersion&lt;NumTerms, Position/Offset, TermFreqs&gt; <p>Field (.tvf) --&gt; TVFVersion&lt;NumTerms, Position/Offset, TermFreqs&gt;
<sup>NumFields</sup> <sup>NumFields</sup>
</p> </p>
<p>TVFVersion --&gt; Int (3 (TermVectorsReader.FORMAT_VERSION2) for Lucene 2.4)</p> <p>TVFVersion --&gt; Int (TermVectorsReader.FORMAT_CURRENT)</p>
<p>NumTerms --&gt; VInt</p> <p>NumTerms --&gt; VInt</p>
<p>Position/Offset --&gt; Byte</p> <p>Position/Offset --&gt; Byte</p>
<p>TermFreqs --&gt; &lt;TermText, TermFreq, Positions?, Offsets?&gt; <p>TermFreqs --&gt; &lt;TermText, TermFreq, Positions?, Offsets?&gt;
@ -1845,15 +1776,7 @@
<p>Although per-segment, this file is maintained exterior to compound segment files. <p>Although per-segment, this file is maintained exterior to compound segment files.
</p> </p>
<p> <p>
<b>Pre-2.1:</b>
Deletions
(.del) --&gt; ByteCount,BitCount,Bits
</p>
<p>
<b>2.1 and above:</b>
Deletions Deletions
(.del) --&gt; [Format],ByteCount,BitCount, Bits | DGaps (depending on Format) (.del) --&gt; [Format],ByteCount,BitCount, Bits | DGaps (depending on Format)
</p> </p>