mirror of https://github.com/apache/lucene.git

Lockless commits

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@476359 13f79535-47bb-0310-9956-ffa450edef68

parent bd6f012511
commit d634ccf4e9
@@ -104,6 +104,15 @@ API Changes

 9. LUCENE-657: Made FuzzyQuery non-final and inner ScoreTerm protected.
    (Steven Parkes via Otis Gospodnetic)

10. LUCENE-701: Lockless commits: a commit lock is no longer required
    when a writer commits and a reader opens the index. This includes
    a change to the index file format (see docs/fileformats.html for
    details). It also removes all APIs associated with the commit
    lock & its timeout. Readers are now truly read-only and do not
    block one another on startup. This is the first step to getting
    Lucene to work correctly over NFS (second step is
    LUCENE-710). (Mike McCandless)

Bug fixes

 1. Fixed the web application demo (built with "ant war-demo") which
@@ -118,7 +118,7 @@ limitations under the License.

 <blockquote>
 <p>
 This document defines the index file formats used
-in Lucene version 2.0. If you are using a different
+in Lucene version 2.1. If you are using a different
 version of Lucene, please consult the copy of
 <code>docs/fileformats.html</code> that was distributed
 with the version you are using.
@@ -143,6 +143,17 @@ limitations under the License.

 Compatibility notes are provided in this document,
 describing how file formats have changed from prior versions.
 </p>
+<p>
+In version 2.1, the file format was changed to allow
+lock-less commits (ie, no more commit lock). The
+change is fully backwards compatible: you can open a
+pre-2.1 index for searching or adding/deleting of
+docs. When the new segments file is saved
+(committed), it will be written in the new file format
+(meaning no specific "upgrade" process is needed).
+But note that once a commit has occurred, pre-2.1
+Lucene will not be able to read the index.
+</p>
 </blockquote>
 </p>
 </td></tr>
@@ -403,6 +414,17 @@ limitations under the License.

 Typically, all segments
 in an index are stored in a single directory, although this is not
 required.
 </p>
+<p>
+As of version 2.1 (lock-less commits), file names are
+never re-used (there is one exception, "segments.gen",
+see below). That is, when any file is saved to the
+Directory it is given a never before used filename.
+This is achieved using a simple generations approach.
+For example, the first segments file is segments_1,
+then segments_2, etc. The generation is a sequential
+long integer represented in alpha-numeric (base 36)
+form.
+</p>
 </blockquote>
 </p>
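To make the base-36 generation naming concrete, here is a small standalone sketch (plain java.lang only; the class name GenerationNames is invented for illustration):

// Sketch: sequential generations rendered in base 36, as segments_N names.
public class GenerationNames {
  public static void main(String[] args) {
    long[] gens = { 1, 9, 10, 35, 36, 1295 };
    for (int i = 0; i < gens.length; i++) {
      // Character.MAX_RADIX == 36, matching the format described above.
      System.out.println("segments_" + Long.toString(gens[i], Character.MAX_RADIX));
    }
    // Prints: segments_1, segments_9, segments_a, segments_z, segments_10, segments_zz
  }
}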
@@ -1080,25 +1102,53 @@ limitations under the License.

 <blockquote>
 <p>
 The active segments in the index are stored in the
-segment info file. An index only has
-a single file in this format, and it is named "segments".
-This lists each segment by name, and also contains the size of each
-segment.
+segment info file, <tt>segments_N</tt>. There may
+be one or more <tt>segments_N</tt> files in the
+index; however, the one with the largest
+generation is the active one (when older
+segments_N files are present it's because they
+temporarily cannot be deleted, or, a writer is in
+the process of committing). This file lists each
+segment by name, has details about the separate
+norms and deletion files, and also contains the
+size of each segment.
 </p>
+<p>
+As of 2.1, there is also a file
+<tt>segments.gen</tt>. This file contains the
+current generation (the <tt>_N</tt> in
+<tt>segments_N</tt>) of the index. This is
+used only as a fallback in case the current
+generation cannot be accurately determined by
+directory listing alone (as is the case for some
+NFS clients with time-based directory cache
+expiration). This file simply contains an Int32
+version header (SegmentInfos.FORMAT_LOCKLESS =
+-2), followed by the generation recorded as Int64,
+written twice.
+</p>
 <p>
+<b>Pre-2.1:</b>
 Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize><sup>SegCount</sup>
 </p>
 <p>
-Format, NameCounter, SegCount, SegSize --> UInt32
+<b>2.1 and above:</b>
+Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> ><sup>SegCount</sup>, IsCompoundFile
 </p>
 <p>
-Version --> UInt64
+Format, NameCounter, SegCount, SegSize, NumField --> Int32
+</p>
+<p>
+Version, DelGen, NormGen --> Int64
 </p>
 <p>
 SegName --> String
 </p>
 <p>
-Format is -1 in Lucene 1.4.
+IsCompoundFile --> Int8
+</p>
+<p>
+Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
 </p>
 <p>
 Version counts how often the index has been
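For concreteness, a minimal sketch of writing segments.gen per the layout just described (assuming the Directory/IndexOutput API used elsewhere in this commit; the class name is invented):

// Sketch: segments.gen layout -- an Int32 format header, then the
// generation as Int64 written twice (per the description above).
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexOutput;
import java.io.IOException;

class SegmentsGenWriter {
  static void writeSegmentsGen(Directory dir, long generation) throws IOException {
    IndexOutput out = dir.createOutput("segments.gen");
    try {
      out.writeInt(-2);          // SegmentInfos.FORMAT_LOCKLESS
      out.writeLong(generation); // written twice so a reader can
      out.writeLong(generation); // detect a partially-written file
    } finally {
      out.close();
    }
  }
}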
@@ -1113,6 +1163,35 @@ limitations under the License.

 </p>
 <p>
 SegSize is the number of documents contained in the segment index.
 </p>
+<p>
+DelGen is the generation count of the separate
+deletes file. If this is -1, there are no
+separate deletes. If it is 0, this is a pre-2.1
+segment and you must check filesystem for the
+existence of _X.del. Anything above zero means
+there are separate deletes (_X_N.del).
+</p>
+<p>
+NumField is the size of the array for NormGen, or
+-1 if there are no NormGens stored.
+</p>
+<p>
+NormGen records the generation of the separate
+norms files. If NumField is -1, there are no
+normGens stored and they are all assumed to be 0
+when the segment file was written pre-2.1 and all
+assumed to be -1 when the segments file is 2.1 or
+above. The generation then has the same meaning
+as delGen (above).
+</p>
+<p>
+IsCompoundFile records whether the segment is
+written as a compound file or not. If this is -1,
+the segment is not a compound file. If it is 1,
+the segment is a compound file. Else it is 0,
+which means we check filesystem to see if _X.cfs
+exists.
+</p>
 </blockquote>
 </td></tr>
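The DelGen rules map to file names as in this small sketch (a hypothetical helper mirroring IndexFileNames.fileNameFromGeneration from this commit; _X stands for the segment name):

// Sketch: DelGen -> deletions file name, per the rules above.
static String delFileName(String segName, long delGen) {
  if (delGen == -1) return null;             // no separate deletes
  if (delGen == 0) return segName + ".del";  // pre-2.1: must check filesystem
  return segName + "_" + Long.toString(delGen, Character.MAX_RADIX) + ".del";
}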
@@ -1121,42 +1200,31 @@ limitations under the License.

 <table border="0" cellspacing="0" cellpadding="2" width="100%">
 <tr><td bgcolor="#828DA6">
 <font color="#ffffff" face="arial,helvetica,sanserif">
-<a name="Lock Files"><strong>Lock Files</strong></a>
+<a name="Lock File"><strong>Lock File</strong></a>
 </font>
 </td></tr>
 <tr><td>
 <blockquote>
 <p>
-Several files are used to indicate that another
-process is using an index. Note that these files are not
+A write lock is used to indicate that another
+process is writing to the index. Note that this file is not
 stored in the index directory itself, but rather in the
 system's temporary directory, as indicated in the Java
 system property "java.io.tmpdir".
 </p>
-<ul>
-<li>
-<p>
-When a file named "commit.lock"
-is present, a process is currently re-writing the "segments"
-file and deleting outdated segment index files, or a process is
-reading the "segments"
-file and opening the files of the segments it names. This lock file
-prevents files from being deleted by another process after a process
-has read the "segments"
-file but before it has managed to open all of the files of the
-segments named therein.
-</p>
-</li>
-
-<li>
-<p>
-When a file named "write.lock"
-is present, a process is currently adding documents to an index, or
-removing files from that index. This lock file prevents several
-processes from attempting to modify an index at the same time.
-</p>
-</li>
-</ul>
+<p>
+The write lock is named "XXXX-write.lock" where
+XXXX is typically a unique prefix computed by the
+directory path to the index. When this file is
+present, a process is currently adding documents
+to an index, or removing files from that index.
+This lock file prevents several processes from
+attempting to modify an index at the same time.
+</p>
+<p>
+Note that prior to version 2.1, Lucene also used a
+commit lock. This was removed in 2.1.
+</p>
 </blockquote>
 </td></tr>
 <tr><td><br/></td></tr>
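For reference, taking the one remaining lock looks roughly like this (a sketch built from the Lock and Directory APIs that appear elsewhere in this commit):

// Sketch: a writer obtaining the write lock.
Lock writeLock = directory.makeLock(IndexWriter.WRITE_LOCK_NAME);
if (!writeLock.obtain(IndexWriter.WRITE_LOCK_TIMEOUT)) {
  throw new IOException("Index locked for write: " + writeLock);
}
// ... modify the index, commit, then: writeLock.release();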
@@ -1170,20 +1238,11 @@ limitations under the License.

 <tr><td>
 <blockquote>
 <p>
-A file named "deletable"
-contains the names of files that are no longer used by the index, but
-which could not be deleted. This is only used on Win32, where a
-file may not be deleted while it is still open. On other platforms
-the file contains only null bytes.
-</p>
-<p>
-Deletable --> DeletableCount,
-<DelableName><sup>DeletableCount</sup>
-</p>
-<p>DeletableCount --> UInt32
-</p>
-<p>DeletableName -->
-String
+Prior to Lucene 2.1 there was a file "deletable"
+that contained details about files that need to be
+deleted. As of 2.1, a writer dynamically computes
+the files that are deletable, instead, so no file
+is written.
 </p>
 </blockquote>
 </td></tr>
@@ -0,0 +1,219 @@

package org.apache.lucene.index;

import org.apache.lucene.index.IndexFileNames;
import org.apache.lucene.index.IndexFileNameFilter;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.Directory;

import java.io.IOException;
import java.io.PrintStream;
import java.util.Vector;
import java.util.HashMap;

/**
 * A utility class (used by both IndexReader and
 * IndexWriter) to keep track of files that need to be
 * deleted because they are no longer referenced by the
 * index.
 */
public class IndexFileDeleter {
  private Vector deletable;
  private Vector pending;
  private Directory directory;
  private SegmentInfos segmentInfos;
  private PrintStream infoStream;

  public IndexFileDeleter(SegmentInfos segmentInfos, Directory directory)
    throws IOException {
    this.segmentInfos = segmentInfos;
    this.directory = directory;
  }

  void setInfoStream(PrintStream infoStream) {
    this.infoStream = infoStream;
  }

  /** Determine index files that are no longer referenced
   * and therefore should be deleted.  This is called once
   * (by the writer), and then subsequently we add onto
   * deletable any files that are no longer needed at the
   * point that we create the unused file (eg when merging
   * segments), and we only remove from deletable when a
   * file is successfully deleted.
   */

  public void findDeletableFiles() throws IOException {

    // Gather all "current" segments:
    HashMap current = new HashMap();
    for(int j=0;j<segmentInfos.size();j++) {
      SegmentInfo segmentInfo = (SegmentInfo) segmentInfos.elementAt(j);
      current.put(segmentInfo.name, segmentInfo);
    }

    // Then go through all files in the Directory that are
    // Lucene index files, and add to deletable if they are
    // not referenced by the current segments info:

    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
    IndexFileNameFilter filter = IndexFileNameFilter.getFilter();

    String[] files = directory.list();

    for (int i = 0; i < files.length; i++) {

      if (filter.accept(null, files[i]) && !files[i].equals(segmentsInfosFileName) && !files[i].equals(IndexFileNames.SEGMENTS_GEN)) {

        String segmentName;
        String extension;

        // First remove any extension:
        int loc = files[i].indexOf('.');
        if (loc != -1) {
          extension = files[i].substring(1+loc);
          segmentName = files[i].substring(0, loc);
        } else {
          extension = null;
          segmentName = files[i];
        }

        // Then, remove any generation count:
        loc = segmentName.indexOf('_', 1);
        if (loc != -1) {
          segmentName = segmentName.substring(0, loc);
        }

        // Delete this file if it's not a "current" segment,
        // or, it is a single index file but there is now a
        // corresponding compound file:
        boolean doDelete = false;

        if (!current.containsKey(segmentName)) {
          // Delete if segment is not referenced:
          doDelete = true;
        } else {
          // OK, segment is referenced, but file may still
          // be orphan'd:
          SegmentInfo info = (SegmentInfo) current.get(segmentName);

          if (filter.isCFSFile(files[i]) && info.getUseCompoundFile()) {
            // This file is in fact stored in a CFS file for
            // this segment:
            doDelete = true;
          } else {

            if ("del".equals(extension)) {
              // This is a _segmentName_N.del file:
              if (!files[i].equals(info.getDelFileName())) {
                // If this is a separate .del file, but it
                // doesn't match the current del filename for
                // this segment, then delete it:
                doDelete = true;
              }
            } else if (extension != null && extension.startsWith("s") && extension.matches("s\\d+")) {
              int field = Integer.parseInt(extension.substring(1));
              // This is a _segmentName_N.sX file:
              if (!files[i].equals(info.getNormFileName(field))) {
                // This is an orphan'd separate norms file:
                doDelete = true;
              }
            }
          }
        }

        if (doDelete) {
          addDeletableFile(files[i]);
          if (infoStream != null) {
            infoStream.println("IndexFileDeleter: file \"" + files[i] + "\" is unreferenced in index and will be deleted on next commit");
          }
        }
      }
    }
  }

  /*
   * Some operating systems (e.g. Windows) don't permit a file to be deleted
   * while it is opened for read (e.g. by another process or thread). So we
   * assume that when a delete fails it is because the file is open in another
   * process, and queue the file for subsequent deletion.
   */

  public final void deleteSegments(Vector segments) throws IOException {

    deleteFiles(); // try to delete files that we couldn't before

    for (int i = 0; i < segments.size(); i++) {
      SegmentReader reader = (SegmentReader)segments.elementAt(i);
      if (reader.directory() == this.directory)
        deleteFiles(reader.files()); // try to delete our files
      else
        deleteFiles(reader.files(), reader.directory()); // delete other files
    }
  }

  public final void deleteFiles(Vector files, Directory directory)
       throws IOException {
    for (int i = 0; i < files.size(); i++)
      directory.deleteFile((String)files.elementAt(i));
  }

  public final void deleteFiles(Vector files)
       throws IOException {
    deleteFiles(); // try to delete files that we couldn't before
    for (int i = 0; i < files.size(); i++) {
      deleteFile((String) files.elementAt(i));
    }
  }

  public final void deleteFile(String file)
       throws IOException {
    try {
      directory.deleteFile(file);    // try to delete each file
    } catch (IOException e) {        // if delete fails
      if (directory.fileExists(file)) {
        if (infoStream != null)
          infoStream.println("IndexFileDeleter: unable to remove file \"" + file + "\": " + e.toString() + "; Will re-try later.");
        addDeletableFile(file);      // add to deletable
      }
    }
  }

  final void clearPendingFiles() {
    pending = null;
  }

  final void addPendingFile(String fileName) {
    if (pending == null) {
      pending = new Vector();
    }
    pending.addElement(fileName);
  }

  final void commitPendingFiles() {
    if (pending != null) {
      if (deletable == null) {
        deletable = pending;
        pending = null;
      } else {
        deletable.addAll(pending);
        pending = null;
      }
    }
  }

  public final void addDeletableFile(String fileName) {
    if (deletable == null) {
      deletable = new Vector();
    }
    deletable.addElement(fileName);
  }

  public final void deleteFiles()
    throws IOException {
    if (deletable != null) {
      Vector oldDeletable = deletable;
      deletable = null;
      deleteFiles(oldDeletable); // try to delete deletable
    }
  }
}
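A hedged usage sketch; the wiring below mirrors what IndexWriter does further down in this commit (initDeleter is an invented name):

// Usage sketch: how a writer wires up the deleter.
void initDeleter(Directory directory) throws IOException {
  SegmentInfos segmentInfos = new SegmentInfos();
  segmentInfos.read(directory);       // load the current segments_N
  IndexFileDeleter deleter = new IndexFileDeleter(segmentInfos, directory);
  deleter.findDeletableFiles();       // scan for unreferenced files
  deleter.deleteFiles();              // delete them (or queue retries)
}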
@@ -19,6 +19,7 @@ package org.apache.lucene.index;

 import java.io.File;
 import java.io.FilenameFilter;
+import java.util.HashSet;

 /**
  * Filename filter that accepts filenames and extensions only created by Lucene.
@@ -28,18 +29,64 @@ import java.io.FilenameFilter;

 */
public class IndexFileNameFilter implements FilenameFilter {

  static IndexFileNameFilter singleton = new IndexFileNameFilter();
  private HashSet extensions;

  public IndexFileNameFilter() {
    extensions = new HashSet();
    for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
      extensions.add(IndexFileNames.INDEX_EXTENSIONS[i]);
    }
  }

  /* (non-Javadoc)
   * @see java.io.FilenameFilter#accept(java.io.File, java.lang.String)
   */
  public boolean accept(File dir, String name) {
-   for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
-     if (name.endsWith("."+IndexFileNames.INDEX_EXTENSIONS[i]))
+   int i = name.lastIndexOf('.');
+   if (i != -1) {
+     String extension = name.substring(1+i);
+     if (extensions.contains(extension)) {
+       return true;
+     } else if (extension.startsWith("f") &&
+                extension.matches("f\\d+")) {
+       return true;
+     } else if (extension.startsWith("s") &&
+                extension.matches("s\\d+")) {
+       return true;
+     }
+   } else {
+     if (name.equals(IndexFileNames.DELETABLE)) return true;
+     else if (name.startsWith(IndexFileNames.SEGMENTS)) return true;
+   }
-   if (name.equals(IndexFileNames.DELETABLE)) return true;
-   else if (name.equals(IndexFileNames.SEGMENTS)) return true;
-   else if (name.matches(".+\\.f\\d+")) return true;
    return false;
  }

  /**
   * Returns true if this is a file that would be contained
   * in a CFS file.  This function should only be called on
   * files that pass the above "accept" (ie, are already
   * known to be a Lucene index file).
   */
  public boolean isCFSFile(String name) {
    int i = name.lastIndexOf('.');
    if (i != -1) {
      String extension = name.substring(1+i);
      if (extensions.contains(extension) &&
          !extension.equals("del") &&
          !extension.equals("gen") &&
          !extension.equals("cfs")) {
        return true;
      }
      if (extension.startsWith("f") &&
          extension.matches("f\\d+")) {
        return true;
      }
    }
    return false;
  }

  public static IndexFileNameFilter getFilter() {
    return singleton;
  }
}
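Illustrative expected results for the rewritten accept logic (a sketch; the values follow directly from the code above):

// Sketch: what the new filter accepts.
IndexFileNameFilter f = IndexFileNameFilter.getFilter();
f.accept(null, "_0.cfs");     // true: known extension
f.accept(null, "_0.f3");      // true: ".f" + number
f.accept(null, "_0_2.s1");    // true: ".s" + number (separate norms)
f.accept(null, "segments_4"); // true: no extension, starts with "segments"
f.accept(null, "foo.txt");    // false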
@@ -27,19 +27,25 @@ final class IndexFileNames {

  /** Name of the index segment file */
  static final String SEGMENTS = "segments";

+ /** Name of the generation reference file name */
+ static final String SEGMENTS_GEN = "segments.gen";

- /** Name of the index deletable file */
+ /** Name of the index deletable file (only used in
+  * pre-lockless indices) */
  static final String DELETABLE = "deletable";

  /**
-  * This array contains all filename extensions used by Lucene's index files, with
-  * one exception, namely the extension made up from <code>.f</code> + a number.
-  * Also note that two of Lucene's files (<code>deletable</code> and
-  * <code>segments</code>) don't have any filename extension.
+  * This array contains all filename extensions used by
+  * Lucene's index files, with two exceptions, namely the
+  * extension made up from <code>.f</code> + a number and
+  * from <code>.s</code> + a number.  Also note that
+  * Lucene's <code>segments_N</code> files do not have any
+  * filename extension.
   */
  static final String INDEX_EXTENSIONS[] = new String[] {
    "cfs", "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del",
-   "tvx", "tvd", "tvf", "tvp" };
+   "tvx", "tvd", "tvf", "tvp", "gen"};

  /** File extensions of old-style index files */
  static final String COMPOUND_EXTENSIONS[] = new String[] {
@@ -50,5 +56,24 @@ final class IndexFileNames {

  static final String VECTOR_EXTENSIONS[] = new String[] {
    "tvx", "tvd", "tvf"
  };

+ /**
+  * Computes the full file name from base, extension and
+  * generation.  If the generation is -1, the file name is
+  * null.  If it's 0, the file name is <base><extension>.
+  * If it's > 0, the file name is <base>_<generation><extension>.
+  *
+  * @param base -- main part of the file name
+  * @param extension -- extension of the filename (including .)
+  * @param gen -- generation
+  */
+ public static final String fileNameFromGeneration(String base, String extension, long gen) {
+   if (gen == -1) {
+     return null;
+   } else if (gen == 0) {
+     return base + extension;
+   } else {
+     return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
+   }
+ }
}
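Sample values, per the rules above (a sketch; IndexFileNames is package-private, so this assumes package-level access):

// fileNameFromGeneration examples:
IndexFileNames.fileNameFromGeneration("segments", "", -1); // null (no such file)
IndexFileNames.fileNameFromGeneration("_7", ".del", 0);    // "_7.del" (pre-2.1 name)
IndexFileNames.fileNameFromGeneration("_7", ".del", 35);   // "_7_z.del"
IndexFileNames.fileNameFromGeneration("segments", "", 36); // "segments_10"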
@@ -113,6 +113,7 @@ public abstract class IndexReader {

  private Directory directory;
  private boolean directoryOwner;
  private boolean closeDirectory;
+ protected IndexFileDeleter deleter;

  private SegmentInfos segmentInfos;
  private Lock writeLock;
@@ -138,24 +139,40 @@ public abstract class IndexReader {

  }

  private static IndexReader open(final Directory directory, final boolean closeDirectory) throws IOException {
-   synchronized (directory) {      // in- & inter-process sync
-     return (IndexReader)new Lock.With(
-         directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-         IndexWriter.COMMIT_LOCK_TIMEOUT) {
-         public Object doBody() throws IOException {
-           SegmentInfos infos = new SegmentInfos();
-           infos.read(directory);
-           if (infos.size() == 1) { // index is optimized
-             return SegmentReader.get(infos, infos.info(0), closeDirectory);
-           }
-           IndexReader[] readers = new IndexReader[infos.size()];
-           for (int i = 0; i < infos.size(); i++)
-             readers[i] = SegmentReader.get(infos.info(i));
-           return new MultiReader(directory, infos, closeDirectory, readers);
-         }
-       }.run();
-   }
+   return (IndexReader) new SegmentInfos.FindSegmentsFile(directory) {
+
+     public Object doBody(String segmentFileName) throws IOException {
+
+       SegmentInfos infos = new SegmentInfos();
+       infos.read(directory, segmentFileName);
+
+       if (infos.size() == 1) { // index is optimized
+         return SegmentReader.get(infos, infos.info(0), closeDirectory);
+       } else {
+
+         // To reduce the chance of hitting FileNotFound
+         // (and having to retry), we open segments in
+         // reverse because IndexWriter merges & deletes
+         // the newest segments first.
+
+         IndexReader[] readers = new IndexReader[infos.size()];
+         for (int i = infos.size()-1; i >= 0; i--) {
+           try {
+             readers[i] = SegmentReader.get(infos.info(i));
+           } catch (IOException e) {
+             // Close all readers we had opened:
+             for(i++;i<infos.size();i++) {
+               readers[i].close();
+             }
+             throw e;
+           }
+         }
+
+         return new MultiReader(directory, infos, closeDirectory, readers);
+       }
+     }
+   }.run();
  }

  /** Returns the directory this index resides in. */
@@ -175,8 +192,12 @@ public abstract class IndexReader {

   * Do not use this to check whether the reader is still up-to-date, use
   * {@link #isCurrent()} instead.
   */
- public static long lastModified(File directory) throws IOException {
-   return FSDirectory.fileModified(directory, IndexFileNames.SEGMENTS);
+ public static long lastModified(File fileDirectory) throws IOException {
+   return ((Long) new SegmentInfos.FindSegmentsFile(fileDirectory) {
+       public Object doBody(String segmentFileName) {
+         return new Long(FSDirectory.fileModified(fileDirectory, segmentFileName));
+       }
+     }.run()).longValue();
  }

  /**
@@ -184,8 +205,12 @@ public abstract class IndexReader {

   * Do not use this to check whether the reader is still up-to-date, use
   * {@link #isCurrent()} instead.
   */
- public static long lastModified(Directory directory) throws IOException {
-   return directory.fileModified(IndexFileNames.SEGMENTS);
+ public static long lastModified(final Directory directory2) throws IOException {
+   return ((Long) new SegmentInfos.FindSegmentsFile(directory2) {
+       public Object doBody(String segmentFileName) throws IOException {
+         return new Long(directory2.fileModified(segmentFileName));
+       }
+     }.run()).longValue();
  }

  /**
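The FindSegmentsFile retry pattern generalizes beyond lastModified; for example, a sketch that reads the current index version through it (SegmentInfos.getVersion is referenced later in this commit):

// Sketch: the doBody/run retry pattern used by the methods above.
long version = ((Long) new SegmentInfos.FindSegmentsFile(directory) {
    public Object doBody(String segmentFileName) throws IOException {
      SegmentInfos infos = new SegmentInfos();
      infos.read(directory, segmentFileName);
      return new Long(infos.getVersion());
    }
  }.run()).longValue();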
@@ -227,21 +252,7 @@ public abstract class IndexReader {

   * @throws IOException if segments file cannot be read.
   */
  public static long getCurrentVersion(Directory directory) throws IOException {
-   synchronized (directory) {                 // in- & inter-process sync
-     Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
-
-     boolean locked=false;
-
-     try {
-       locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
-
-       return SegmentInfos.readCurrentVersion(directory);
-     } finally {
-       if (locked) {
-         commitLock.release();
-       }
-     }
-   }
+   return SegmentInfos.readCurrentVersion(directory);
  }

  /**
@@ -259,21 +270,7 @@ public abstract class IndexReader {

   * @throws IOException
   */
  public boolean isCurrent() throws IOException {
-   synchronized (directory) {                 // in- & inter-process sync
-     Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
-
-     boolean locked=false;
-
-     try {
-       locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
-
-       return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
-     } finally {
-       if (locked) {
-         commitLock.release();
-       }
-     }
-   }
+   return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
  }

  /**
@@ -319,7 +316,7 @@ public abstract class IndexReader {

   * @return <code>true</code> if an index exists; <code>false</code> otherwise
   */
  public static boolean indexExists(String directory) {
-   return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+   return indexExists(new File(directory));
  }

  /**
@@ -328,8 +325,9 @@ public abstract class IndexReader {

   * @param directory the directory to check for an index
   * @return <code>true</code> if an index exists; <code>false</code> otherwise
   */

  public static boolean indexExists(File directory) {
-   return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+   return SegmentInfos.getCurrentSegmentGeneration(directory.list()) != -1;
  }

  /**
@@ -340,7 +338,7 @@ public abstract class IndexReader {

   * @throws IOException if there is a problem with accessing the index
   */
  public static boolean indexExists(Directory directory) throws IOException {
-   return directory.fileExists(IndexFileNames.SEGMENTS);
+   return SegmentInfos.getCurrentSegmentGeneration(directory) != -1;
  }

  /** Returns the number of documents in this index. */
@@ -592,17 +590,22 @@ public abstract class IndexReader {

   */
  protected final synchronized void commit() throws IOException{
    if(hasChanges){
+     if (deleter == null) {
+       // In the MultiReader case, we share this deleter
+       // across all SegmentReaders:
+       setDeleter(new IndexFileDeleter(segmentInfos, directory));
+       deleter.deleteFiles();
+     }
      if(directoryOwner){
-       synchronized (directory) {      // in- & inter-process sync
-         new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-                 IndexWriter.COMMIT_LOCK_TIMEOUT) {
-           public Object doBody() throws IOException {
-             doCommit();
-             segmentInfos.write(directory);
-             return null;
-           }
-         }.run();
-       }
+       deleter.clearPendingFiles();
+       doCommit();
+       String oldInfoFileName = segmentInfos.getCurrentSegmentFileName();
+       segmentInfos.write(directory);
+
+       // Attempt to delete all files we just obsoleted:
+       deleter.deleteFile(oldInfoFileName);
+       deleter.commitPendingFiles();
+       deleter.deleteFiles();
        if (writeLock != null) {
          writeLock.release();  // release write lock
          writeLock = null;
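The ordering above is the heart of a lock-less commit; condensed as a sketch (names as in the surrounding code):

// Sketch: point-in-time commit. The next segments_N is written in
// full before the old one is deleted, so a reader listing the
// directory always finds a complete commit.
String oldSegmentsFile = segmentInfos.getCurrentSegmentFileName(); // e.g. segments_4
doCommit();                          // flush this reader's deletes/norms
segmentInfos.write(directory);       // writes the next generation, segments_5
deleter.deleteFile(oldSegmentsFile); // only now remove segments_4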
@@ -614,6 +617,13 @@ public abstract class IndexReader {

    hasChanges = false;
  }

+ protected void setDeleter(IndexFileDeleter deleter) {
+   this.deleter = deleter;
+ }
+
+ protected IndexFileDeleter getDeleter() {
+   return deleter;
+ }

  /** Implements commit. */
  protected abstract void doCommit() throws IOException;
@@ -658,8 +668,7 @@ public abstract class IndexReader {

   */
  public static boolean isLocked(Directory directory) throws IOException {
    return
-     directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked() ||
-     directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).isLocked();
+     directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked();
  }

  /**
@@ -684,7 +693,6 @@ public abstract class IndexReader {

   */
  public static void unlock(Directory directory) throws IOException {
    directory.makeLock(IndexWriter.WRITE_LOCK_NAME).release();
-   directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).release();
  }

  /**
@@ -67,16 +67,7 @@ public class IndexWriter {

  private long writeLockTimeout = WRITE_LOCK_TIMEOUT;

- /**
-  * Default value for the commit lock timeout (10,000).
-  * @see #setDefaultCommitLockTimeout
-  */
- public static long COMMIT_LOCK_TIMEOUT = 10000;
-
- private long commitLockTimeout = COMMIT_LOCK_TIMEOUT;
-
  public static final String WRITE_LOCK_NAME = "write.lock";
- public static final String COMMIT_LOCK_NAME = "commit.lock";

  /**
   * Default value is 10.  Change using {@link #setMergeFactor(int)}.
@@ -111,6 +102,7 @@ public class IndexWriter {

  private SegmentInfos segmentInfos = new SegmentInfos();       // the segments
  private SegmentInfos ramSegmentInfos = new SegmentInfos();    // the segments in ramDirectory
  private final Directory ramDirectory = new RAMDirectory();    // for temp segs
+ private IndexFileDeleter deleter;

  private Lock writeLock;
@@ -260,19 +252,30 @@ public class IndexWriter {

    this.writeLock = writeLock;                   // save it

    try {
-     synchronized (directory) {        // in- & inter-process sync
-       new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME), commitLockTimeout) {
-         public Object doBody() throws IOException {
-           if (create)
-             segmentInfos.write(directory);
-           else
-             segmentInfos.read(directory);
-           return null;
-         }
-       }.run();
+     if (create) {
+       // Try to read first.  This is to allow create
+       // against an index that's currently open for
+       // searching.  In this case we write the next
+       // segments_N file with no segments:
+       try {
+         segmentInfos.read(directory);
+         segmentInfos.clear();
+       } catch (IOException e) {
+         // Likely this means it's a fresh directory
+       }
+       segmentInfos.write(directory);
+     } else {
+       segmentInfos.read(directory);
+     }
+
+     // Create a deleter to keep track of which files can
+     // be deleted:
+     deleter = new IndexFileDeleter(segmentInfos, directory);
+     deleter.setInfoStream(infoStream);
+     deleter.findDeletableFiles();
+     deleter.deleteFiles();

    } catch (IOException e) {
      // the doBody method failed
      this.writeLock.release();
      this.writeLock = null;
      throw e;
@@ -380,35 +383,6 @@ public class IndexWriter {

    return infoStream;
  }

- /**
-  * Sets the maximum time to wait for a commit lock (in milliseconds) for this instance of IndexWriter.  @see
-  * @see #setDefaultCommitLockTimeout to change the default value for all instances of IndexWriter.
-  */
- public void setCommitLockTimeout(long commitLockTimeout) {
-   this.commitLockTimeout = commitLockTimeout;
- }
-
- /**
-  * @see #setCommitLockTimeout
-  */
- public long getCommitLockTimeout() {
-   return commitLockTimeout;
- }
-
- /**
-  * Sets the default (for any instance of IndexWriter) maximum time to wait for a commit lock (in milliseconds)
-  */
- public static void setDefaultCommitLockTimeout(long commitLockTimeout) {
-   IndexWriter.COMMIT_LOCK_TIMEOUT = commitLockTimeout;
- }
-
- /**
-  * @see #setDefaultCommitLockTimeout
-  */
- public static long getDefaultCommitLockTimeout() {
-   return IndexWriter.COMMIT_LOCK_TIMEOUT;
- }
-
  /**
   * Sets the maximum time to wait for a write lock (in milliseconds) for this instance of IndexWriter.  @see
   * @see #setDefaultWriteLockTimeout to change the default value for all instances of IndexWriter.
@@ -517,7 +491,7 @@ public class IndexWriter {

    String segmentName = newRAMSegmentName();
    dw.addDocument(segmentName, doc);
    synchronized (this) {
-     ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory));
+     ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory, false));
      maybeFlushRamSegments();
    }
  }
@@ -790,36 +764,26 @@ public class IndexWriter {

    int docCount = merger.merge();                // merge 'em

    segmentInfos.setSize(0);                      // pop old infos & add new
-   segmentInfos.addElement(new SegmentInfo(mergedName, docCount, directory));
+   SegmentInfo info = new SegmentInfo(mergedName, docCount, directory, false);
+   segmentInfos.addElement(info);

    if(sReader != null)
      sReader.close();

-   synchronized (directory) {          // in- & inter-process sync
-     new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-       public Object doBody() throws IOException {
-         segmentInfos.write(directory);          // commit changes
-         return null;
-       }
-     }.run();
-   }
+   String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+   segmentInfos.write(directory);                // commit changes

-   deleteSegments(segmentsToDelete);  // delete now-unused segments
+   deleter.deleteFile(segmentsInfosFileName);    // delete old segments_N file
+   deleter.deleteSegments(segmentsToDelete);     // delete now-unused segments

    if (useCompoundFile) {
-     final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-     synchronized (directory) { // in- & inter-process sync
-       new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-         public Object doBody() throws IOException {
-           // make compound file visible for SegmentReaders
-           directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-           return null;
-         }
-       }.run();
-     }
+     Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
+     segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+     info.setUseCompoundFile(true);
+     segmentInfos.write(directory);              // commit again so readers know we've switched this segment to a compound file

-     // delete now unused files of segment
-     deleteFiles(filesToDelete);
+     deleter.deleteFile(segmentsInfosFileName);  // delete old segments_N file
+     deleter.deleteFiles(filesToDelete);         // delete now unused files of segment
    }
  }
@@ -937,10 +901,11 @@ public class IndexWriter {

   */
  private final int mergeSegments(SegmentInfos sourceSegments, int minSegment, int end)
    throws IOException {

    final String mergedName = newSegmentName();
    if (infoStream != null) infoStream.print("merging segments");
    SegmentMerger merger = new SegmentMerger(this, mergedName);


    final Vector segmentsToDelete = new Vector();
    for (int i = minSegment; i < end; i++) {
      SegmentInfo si = sourceSegments.info(i);
@@ -960,7 +925,7 @@ public class IndexWriter {

    }

    SegmentInfo newSegment = new SegmentInfo(mergedName, mergedDocCount,
-                                            directory);
+                                            directory, false);
    if (sourceSegments == ramSegmentInfos) {
      sourceSegments.removeAllElements();
      segmentInfos.addElement(newSegment);
@@ -973,115 +938,26 @@ public class IndexWriter {

    // close readers before we attempt to delete now-obsolete segments
    merger.closeReaders();

-   synchronized (directory) {                 // in- & inter-process sync
-     new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-       public Object doBody() throws IOException {
-         segmentInfos.write(directory);       // commit before deleting
-         return null;
-       }
-     }.run();
-   }
-
-   deleteSegments(segmentsToDelete);  // delete now-unused segments
+   String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+   segmentInfos.write(directory);             // commit before deleting
+
+   deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
+   deleter.deleteSegments(segmentsToDelete);  // delete now-unused segments

    if (useCompoundFile) {
-     final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-     synchronized (directory) { // in- & inter-process sync
-       new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-         public Object doBody() throws IOException {
-           // make compound file visible for SegmentReaders
-           directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-           return null;
-         }
-       }.run();
-     }
+     Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");

-     // delete now unused files of segment
-     deleteFiles(filesToDelete);
+     segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+     newSegment.setUseCompoundFile(true);
+     segmentInfos.write(directory);           // commit again so readers know we've switched this segment to a compound file
+
+     deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
+     deleter.deleteFiles(filesToDelete);        // delete now-unused segments
    }

    return mergedDocCount;
  }

- /*
-  * Some operating systems (e.g. Windows) don't permit a file to be deleted
-  * while it is opened for read (e.g. by another process or thread).  So we
-  * assume that when a delete fails it is because the file is open in another
-  * process, and queue the file for subsequent deletion.
-  */
-
- private final void deleteSegments(Vector segments) throws IOException {
-   Vector deletable = new Vector();
-
-   deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-
-   for (int i = 0; i < segments.size(); i++) {
-     SegmentReader reader = (SegmentReader)segments.elementAt(i);
-     if (reader.directory() == this.directory)
-       deleteFiles(reader.files(), deletable);    // try to delete our files
-     else
-       deleteFiles(reader.files(), reader.directory()); // delete other files
-   }
-
-   writeDeleteableFiles(deletable);               // note files we can't delete
- }
-
- private final void deleteFiles(Vector files) throws IOException {
-   Vector deletable = new Vector();
-   deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-   deleteFiles(files, deletable);                 // try to delete our files
-   writeDeleteableFiles(deletable);               // note files we can't delete
- }
-
- private final void deleteFiles(Vector files, Directory directory)
-      throws IOException {
-   for (int i = 0; i < files.size(); i++)
-     directory.deleteFile((String)files.elementAt(i));
- }
-
- private final void deleteFiles(Vector files, Vector deletable)
-      throws IOException {
-   for (int i = 0; i < files.size(); i++) {
-     String file = (String)files.elementAt(i);
-     try {
-       directory.deleteFile(file);               // try to delete each file
-     } catch (IOException e) {                    // if delete fails
-       if (directory.fileExists(file)) {
-         if (infoStream != null)
-           infoStream.println(e.toString() + "; Will re-try later.");
-         deletable.addElement(file);              // add to deletable
-       }
-     }
-   }
- }
-
- private final Vector readDeleteableFiles() throws IOException {
-   Vector result = new Vector();
-   if (!directory.fileExists(IndexFileNames.DELETABLE))
-     return result;
-
-   IndexInput input = directory.openInput(IndexFileNames.DELETABLE);
-   try {
-     for (int i = input.readInt(); i > 0; i--)    // read file names
-       result.addElement(input.readString());
-   } finally {
-     input.close();
-   }
-   return result;
- }
-
- private final void writeDeleteableFiles(Vector files) throws IOException {
-   IndexOutput output = directory.createOutput("deleteable.new");
-   try {
-     output.writeInt(files.size());
-     for (int i = 0; i < files.size(); i++)
-       output.writeString((String)files.elementAt(i));
-   } finally {
-     output.close();
-   }
-   directory.renameFile("deleteable.new", IndexFileNames.DELETABLE);
- }
-
  private final boolean checkNonDecreasingLevels(int start) {
    int lowerBound = -1;
    int upperBound = minMergeDocs;
@@ -218,6 +218,13 @@ public class MultiReader extends IndexReader {

    return new MultiTermPositions(subReaders, starts);
  }

+ protected void setDeleter(IndexFileDeleter deleter) {
+   // Share deleter to our SegmentReaders:
+   this.deleter = deleter;
+   for (int i = 0; i < subReaders.length; i++)
+     subReaders[i].setDeleter(deleter);
+ }

  protected void doCommit() throws IOException {
    for (int i = 0; i < subReaders.length; i++)
      subReaders[i].commit();
@@ -18,15 +18,302 @@ package org.apache.lucene.index;

 */

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.IndexInput;
import java.io.IOException;

final class SegmentInfo {
  public String name;     // unique name in dir
  public int docCount;    // number of docs in seg
  public Directory dir;   // where segment resides

  private boolean preLockless; // true if this is a segments file written before
                               // lock-less commits (XXX)

  private long delGen;         // current generation of del file; -1 if there
                               // are no deletes; 0 if it's a pre-XXX segment
                               // (and we must check filesystem); 1 or higher if
                               // there are deletes at generation N

  private long[] normGen;      // current generations of each field's norm file.
                               // If this array is null, we must check filesystem
                               // when preLockLess is true.  Else,
                               // there are no separate norms

  private byte isCompoundFile; // -1 if it is not; 1 if it is; 0 if it's
                               // pre-XXX (ie, must check file system to see
                               // if <name>.cfs exists)

  public SegmentInfo(String name, int docCount, Directory dir) {
    this.name = name;
    this.docCount = docCount;
    this.dir = dir;
    delGen = -1;
    isCompoundFile = 0;
    preLockless = true;
  }

  public SegmentInfo(String name, int docCount, Directory dir, boolean isCompoundFile) {
    this(name, docCount, dir);
    if (isCompoundFile) {
      this.isCompoundFile = 1;
    } else {
      this.isCompoundFile = -1;
    }
    preLockless = false;
  }

  /**
   * Construct a new SegmentInfo instance by reading a
   * previously saved SegmentInfo from input.
   *
   * @param dir directory to load from
   * @param format format of the segments info file
   * @param input input handle to read segment info from
   */
  public SegmentInfo(Directory dir, int format, IndexInput input) throws IOException {
    this.dir = dir;
    name = input.readString();
    docCount = input.readInt();
    if (format <= SegmentInfos.FORMAT_LOCKLESS) {
      delGen = input.readLong();
      int numNormGen = input.readInt();
      if (numNormGen == -1) {
        normGen = null;
      } else {
        normGen = new long[numNormGen];
        for(int j=0;j<numNormGen;j++) {
          normGen[j] = input.readLong();
        }
      }
      isCompoundFile = input.readByte();
      preLockless = isCompoundFile == 0;
    } else {
      delGen = 0;
      normGen = null;
      isCompoundFile = 0;
      preLockless = true;
    }
  }

  void setNumField(int numField) {
    if (normGen == null) {
      // normGen is null if we loaded a pre-XXX segment
      // file, or, if this segments file hasn't had any
      // norms set against it yet:
      normGen = new long[numField];

      if (!preLockless) {
        // This is a FORMAT_LOCKLESS segment, which means
        // there are no norms:
        for(int i=0;i<numField;i++) {
          normGen[i] = -1;
        }
      }
    }
  }

  boolean hasDeletions()
    throws IOException {
    // Cases:
    //
    //   delGen == -1: this means this segment was written
    //     by the LOCKLESS code and for certain does not have
    //     deletions yet
    //
    //   delGen == 0: this means this segment was written by
    //     pre-LOCKLESS code which means we must check
    //     directory to see if .del file exists
    //
    //   delGen > 0: this means this segment was written by
    //     the LOCKLESS code and for certain has
    //     deletions
    //
    if (delGen == -1) {
      return false;
    } else if (delGen > 0) {
      return true;
    } else {
      return dir.fileExists(getDelFileName());
    }
  }

  void advanceDelGen() {
    // delGen 0 is reserved for pre-LOCKLESS format
    if (delGen == -1) {
      delGen = 1;
    } else {
      delGen++;
    }
  }

  void clearDelGen() {
    delGen = -1;
  }

  String getDelFileName() {
    if (delGen == -1) {
      // In this case we know there is no deletion filename
      // against this segment
      return null;
    } else {
      // If delGen is 0, it's the pre-lockless-commit file format
      return IndexFileNames.fileNameFromGeneration(name, ".del", delGen);
    }
  }

  /**
   * Returns true if this field for this segment has saved a separate norms file (_<segment>_N.sX).
   *
   * @param fieldNumber the field index to check
   */
  boolean hasSeparateNorms(int fieldNumber)
    throws IOException {
    if ((normGen == null && preLockless) || (normGen != null && normGen[fieldNumber] == 0)) {
      // Must fallback to directory file exists check:
      String fileName = name + ".s" + fieldNumber;
      return dir.fileExists(fileName);
    } else if (normGen == null || normGen[fieldNumber] == -1) {
      return false;
    } else {
      return true;
    }
  }

  /**
   * Returns true if any fields in this segment have separate norms.
   */
  boolean hasSeparateNorms()
    throws IOException {
    if (normGen == null) {
      if (!preLockless) {
        // This means we were created w/ LOCKLESS code and no
        // norms are written yet:
        return false;
      } else {
        // This means this segment was saved with pre-LOCKLESS
        // code.  So we must fallback to the original
        // directory list check:
        String[] result = dir.list();
        String pattern;
        pattern = name + ".s";
        int patternLength = pattern.length();
        for(int i = 0; i < result.length; i++){
          if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
            return true;
        }
        return false;
      }
    } else {
      // This means this segment was saved with LOCKLESS
      // code so we first check whether any normGen's are >
      // 0 (meaning they definitely have separate norms):
      for(int i=0;i<normGen.length;i++) {
        if (normGen[i] > 0) {
          return true;
        }
      }
      // Next we look for any == 0.  These cases were
      // pre-LOCKLESS and must be checked in directory:
      for(int i=0;i<normGen.length;i++) {
        if (normGen[i] == 0) {
          if (dir.fileExists(getNormFileName(i))) {
            return true;
          }
        }
      }
    }

    return false;
  }

  /**
   * Increment the generation count for the norms file for
   * this field.
   *
   * @param fieldIndex field whose norm file will be rewritten
   */
  void advanceNormGen(int fieldIndex) {
    if (normGen[fieldIndex] == -1) {
      normGen[fieldIndex] = 1;
    } else {
      normGen[fieldIndex]++;
    }
  }

  /**
   * Get the file name for the norms file for this field.
   *
   * @param number field index
   */
  String getNormFileName(int number) throws IOException {
    String prefix;

    long gen;
    if (normGen == null) {
      gen = 0;
    } else {
      gen = normGen[number];
    }

    if (hasSeparateNorms(number)) {
      prefix = ".s";
      return IndexFileNames.fileNameFromGeneration(name, prefix + number, gen);
    } else {
      prefix = ".f";
      return IndexFileNames.fileNameFromGeneration(name, prefix + number, 0);
    }
  }

  /**
   * Mark whether this segment is stored as a compound file.
   *
   * @param isCompoundFile true if this is a compound file;
   * else, false
   */
  void setUseCompoundFile(boolean isCompoundFile) {
    if (isCompoundFile) {
      this.isCompoundFile = 1;
    } else {
      this.isCompoundFile = -1;
    }
  }

  /**
   * Returns true if this segment is stored as a compound
   * file; else, false.
   *
   * @param directory directory to check.  This parameter is
   * only used when the segment was written before version
   * XXX (at which point compound file or not became stored
   * in the segments info file).
   */
  boolean getUseCompoundFile() throws IOException {
    if (isCompoundFile == -1) {
      return false;
    } else if (isCompoundFile == 1) {
      return true;
    } else {
      return dir.fileExists(name + ".cfs");
    }
  }

  /**
   * Save this segment's info.
   */
  void write(IndexOutput output)
    throws IOException {
    output.writeString(name);
    output.writeInt(docCount);
    output.writeLong(delGen);
    if (normGen == null) {
      output.writeInt(-1);
    } else {
      output.writeInt(normGen.length);
      for(int j=0;j<normGen.length;j++) {
        output.writeLong(normGen[j]);
      }
    }
    output.writeByte(isCompoundFile);
  }
}
@@ -19,36 +19,151 @@ package org.apache.lucene.index;

import java.util.Vector;
import java.io.IOException;
import java.io.PrintStream;
import java.io.File;
import java.io.FileNotFoundException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.Constants;

-final class SegmentInfos extends Vector {
+public final class SegmentInfos extends Vector {

  /** The file format version, a negative number. */
  /* Works since counter, the old 1st entry, is always >= 0 */
  public static final int FORMAT = -1;

  /** This is the current file format written.  It differs
   * slightly from the previous format in that file names
   * are never re-used (write once).  Instead, each file is
   * written to the next generation.  For example,
   * segments_1, segments_2, etc.  This allows us to not use
   * a commit lock.  See <a
   * href="http://lucene.apache.org/java/docs/fileformats.html">file
   * formats</a> for details.
   */
  public static final int FORMAT_LOCKLESS = -2;

  public int counter = 0;    // used to name new segments

  /**
   * counts how often the index has been changed by adding or deleting docs.
   * starting with the current time in milliseconds forces to create unique version numbers.
   */
  private long version = System.currentTimeMillis();

  private long generation = 0; // generation of the "segments_N" file we read

  /**
   * If non-null, information about loading segments_N files
   * will be printed here.  @see #setInfoStream.
   */
  private static PrintStream infoStream;

  public final SegmentInfo info(int i) {
    return (SegmentInfo) elementAt(i);
  }

- public final void read(Directory directory) throws IOException {
-
-   IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);

  /**
   * Get the generation (N) of the current segments_N file
   * from a list of files.
   *
   * @param files -- array of file names to check
   */
  public static long getCurrentSegmentGeneration(String[] files) {
    if (files == null) {
      return -1;
    }
    long max = -1;
    int prefixLen = IndexFileNames.SEGMENTS.length()+1;
    for (int i = 0; i < files.length; i++) {
      String file = files[i];
      if (file.startsWith(IndexFileNames.SEGMENTS) && !file.equals(IndexFileNames.SEGMENTS_GEN)) {
        if (file.equals(IndexFileNames.SEGMENTS)) {
          // Pre lock-less commits:
          if (max == -1) {
            max = 0;
          }
        } else {
          long v = Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
          if (v > max) {
            max = v;
          }
        }
      }
    }
    return max;
  }

  /**
   * Get the generation (N) of the current segments_N file
   * in the directory.
   *
   * @param directory -- directory to search for the latest segments_N file
   */
  public static long getCurrentSegmentGeneration(Directory directory) throws IOException {
    String[] files = directory.list();
    if (files == null)
      throw new IOException("Cannot read directory " + directory);
    return getCurrentSegmentGeneration(files);
  }

  /**
   * Get the filename of the current segments_N file
   * from a list of files.
   *
   * @param files -- array of file names to check
   */
  public static String getCurrentSegmentFileName(String[] files) throws IOException {
    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                 "",
                                                 getCurrentSegmentGeneration(files));
  }

  /**
   * Get the filename of the current segments_N file
   * in the directory.
   *
   * @param directory -- directory to search for the latest segments_N file
   */
  public static String getCurrentSegmentFileName(Directory directory) throws IOException {
    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                 "",
                                                 getCurrentSegmentGeneration(directory));
  }

  /**
   * Get the segment_N filename in use by this segment infos.
   */
  public String getCurrentSegmentFileName() {
    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                 "",
                                                 generation);
  }

  /**
   * Read a particular segmentFileName.  Note that this may
   * throw an IOException if a commit is in process.
   *
   * @param directory -- directory containing the segments file
   * @param segmentFileName -- segment file to load
   */
  public final void read(Directory directory, String segmentFileName) throws IOException {
    boolean success = false;

    IndexInput input = directory.openInput(segmentFileName);

    if (segmentFileName.equals(IndexFileNames.SEGMENTS)) {
      generation = 0;
    } else {
      generation = Long.parseLong(segmentFileName.substring(1+IndexFileNames.SEGMENTS.length()),
                                  Character.MAX_RADIX);
    }

    try {
      int format = input.readInt();
      if(format < 0){     // file contains explicit format info
        // check that it is a format we can understand
-       if (format < FORMAT)
+       if (format < FORMAT_LOCKLESS)
          throw new IOException("Unknown format version: " + format);
        version = input.readLong(); // read version
        counter = input.readInt(); // read counter
@ -58,9 +173,7 @@ final class SegmentInfos extends Vector {
|
|||
}
|
||||
|
||||
for (int i = input.readInt(); i > 0; i--) { // read segmentInfos
|
||||
SegmentInfo si =
|
||||
new SegmentInfo(input.readString(), input.readInt(), directory);
|
||||
addElement(si);
|
||||
addElement(new SegmentInfo(directory, format, input));
|
||||
}
|
||||
|
||||
if(format >= 0){ // in old format the version number may be at the end of the file
|
||||
|
@ -69,31 +182,71 @@ final class SegmentInfos extends Vector {
|
|||
else
|
||||
version = input.readLong(); // read version
|
||||
}
|
||||
success = true;
|
||||
}
|
||||
finally {
|
||||
input.close();
|
||||
if (!success) {
|
||||
// Clear any segment infos we had loaded so we
|
||||
// have a clean slate on retry:
|
||||
clear();
|
||||
}
|
||||
}
|
||||
}
|
||||
/**
|
||||
* This version of read uses the retry logic (for lock-less
|
||||
* commits) to find the right segments file to load.
|
||||
*/
|
||||
public final void read(Directory directory) throws IOException {
|
||||
|
||||
generation = -1;
|
||||
|
||||
new FindSegmentsFile(directory) {
|
||||
|
||||
public Object doBody(String segmentFileName) throws IOException {
|
||||
read(directory, segmentFileName);
|
||||
return null;
|
||||
}
|
||||
}.run();
|
||||
}
|
||||
|
||||
public final void write(Directory directory) throws IOException {
|
||||
IndexOutput output = directory.createOutput("segments.new");
|
||||
|
||||
// Always advance the generation on write:
|
||||
if (generation == -1) {
|
||||
generation = 1;
|
||||
} else {
|
||||
generation++;
|
||||
}
|
||||
|
||||
String segmentFileName = getCurrentSegmentFileName();
|
||||
IndexOutput output = directory.createOutput(segmentFileName);
|
||||
|
||||
try {
|
||||
output.writeInt(FORMAT); // write FORMAT
|
||||
output.writeLong(++version); // every write changes the index
|
||||
output.writeInt(FORMAT_LOCKLESS); // write FORMAT
|
||||
output.writeLong(++version); // every write changes
|
||||
// the index
|
||||
output.writeInt(counter); // write counter
|
||||
output.writeInt(size()); // write infos
|
||||
for (int i = 0; i < size(); i++) {
|
||||
SegmentInfo si = info(i);
|
||||
output.writeString(si.name);
|
||||
output.writeInt(si.docCount);
|
||||
si.write(output);
|
||||
}
|
||||
}
|
||||
finally {
|
||||
output.close();
|
||||
}
|
||||
|
||||
// install new segment info
|
||||
directory.renameFile("segments.new", IndexFileNames.SEGMENTS);
|
||||
try {
|
||||
output = directory.createOutput(IndexFileNames.SEGMENTS_GEN);
|
||||
output.writeInt(FORMAT_LOCKLESS);
|
||||
output.writeLong(generation);
|
||||
output.writeLong(generation);
|
||||
output.close();
|
||||
} catch (IOException e) {
|
||||
// It's OK if we fail to write this file since it's
|
||||
// used only as one of the retry fallbacks.
|
||||
}
|
||||
}

/**

@@ -108,30 +261,322 @@ final class SegmentInfos extends Vector {
 */
public static long readCurrentVersion(Directory directory)
throws IOException {

return ((Long) new FindSegmentsFile(directory) {
public Object doBody(String segmentFileName) throws IOException {

IndexInput input = directory.openInput(segmentFileName);

int format = 0;
long version = 0;
try {
format = input.readInt();
if(format < 0){
if (format < FORMAT_LOCKLESS)
throw new IOException("Unknown format version: " + format);
version = input.readLong(); // read version
}
}
finally {
input.close();
}

if(format < 0)
return new Long(version);

// We cannot be sure about the format of the file.
// Therefore we have to read the whole file and cannot simply seek to the version entry.
SegmentInfos sis = new SegmentInfos();
sis.read(directory, segmentFileName);
return new Long(sis.getVersion());
}
}.run()).longValue();
}

/** If non-null, information about retries when loading
 * the segments file will be printed to this.
 */
public static void setInfoStream(PrintStream infoStream) {
SegmentInfos.infoStream = infoStream;
}

/* Advanced configuration of retry logic in loading
segments_N file */
private static int defaultGenFileRetryCount = 10;
private static int defaultGenFileRetryPauseMsec = 50;
private static int defaultGenLookaheadCount = 10;

/**
 * Advanced: set how many times to try loading the
 * segments.gen file contents to determine current segment
 * generation. This file is only referenced when the
 * primary method (listing the directory) fails.
 */
public static void setDefaultGenFileRetryCount(int count) {
defaultGenFileRetryCount = count;
}

/**
 * @see #setDefaultGenFileRetryCount
 */
public static int getDefaultGenFileRetryCount() {
return defaultGenFileRetryCount;
}

/**
 * Advanced: set how many milliseconds to pause in between
 * attempts to load the segments.gen file.
 */
public static void setDefaultGenFileRetryPauseMsec(int msec) {
defaultGenFileRetryPauseMsec = msec;
}

/**
 * @see #setDefaultGenFileRetryPauseMsec
 */
public static int getDefaultGenFileRetryPauseMsec() {
return defaultGenFileRetryPauseMsec;
}

/**
 * Advanced: set how many times to try incrementing the
 * gen when loading the segments file. This only runs if
 * the primary (listing directory) and secondary (opening
 * segments.gen file) methods fail to find the segments
 * file.
 */
public static void setDefaultGenLookaheadCount(int count) {
defaultGenLookaheadCount = count;
}
/**
 * @see #setDefaultGenLookaheadCount
 */
public static int getDefaultGenLookahedCount() {
return defaultGenLookaheadCount;
}
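
A brief usage sketch of the three retry knobs defined above; the values here are arbitrary illustrations, not recommended settings:

    // Example: be more patient on a mount with aggressive attribute caching.
    SegmentInfos.setDefaultGenFileRetryCount(20);      // attempts to read segments.gen
    SegmentInfos.setDefaultGenFileRetryPauseMsec(100); // pause between those attempts
    SegmentInfos.setDefaultGenLookaheadCount(10);      // generations to probe ahead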

/**
 * @see #setInfoStream
 */
public static PrintStream getInfoStream() {
return infoStream;
}

private static void message(String message) {
if (infoStream != null) {
infoStream.println(Thread.currentThread().getName() + ": " + message);
}
}

/**
 * Utility class for executing code that needs to do
 * something with the current segments file. This is
 * necessary with lock-less commits because from the time
 * you locate the current segments file name, until you
 * actually open it, read its contents, or check modified
 * time, etc., it could have been deleted due to a writer
 * commit finishing.
 */
public abstract static class FindSegmentsFile {

File fileDirectory;
Directory directory;

public FindSegmentsFile(File directory) {
this.fileDirectory = directory;
}

public FindSegmentsFile(Directory directory) {
this.directory = directory;
}

public Object run() throws IOException {
String segmentFileName = null;
long lastGen = -1;
long gen = 0;
int genLookaheadCount = 0;
IOException exc = null;
boolean retry = false;

int method = 0;

// Loop until we succeed in calling doBody() without
// hitting an IOException. An IOException most likely
// means a commit was in process and has finished, in
// the time it took us to load the now-old infos files
// (and segments files). It's also possible it's a
// true error (corrupt index). To distinguish these,
// on each retry we must see "forward progress" on
// which generation we are trying to load. If we
// don't, then the original error is real and we throw
// it.

IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
int format = 0;
long version = 0;
try {
format = input.readInt();
if(format < 0){
if (format < FORMAT)
throw new IOException("Unknown format version: " + format);
version = input.readLong(); // read version
// We have three methods for determining the current
// generation. We try each in sequence.

while(true) {

// Method 1: list the directory and use the highest
// segments_N file. This method works well as long
// as there is no stale caching on the directory
// contents:
String[] files = null;

if (0 == method) {
if (directory != null) {
files = directory.list();
} else {
files = fileDirectory.list();
}

gen = getCurrentSegmentGeneration(files);

if (gen == -1) {
String s = "";
for(int i=0;i<files.length;i++) {
s += " " + files[i];
}
throw new FileNotFoundException("no segments* file found: files:" + s);
}
}

// Method 2 (fallback if Method 1 isn't reliable):
// if the directory listing seems to be stale, then
// try loading the "segments.gen" file.
if (1 == method || (0 == method && lastGen == gen && retry)) {

method = 1;

for(int i=0;i<defaultGenFileRetryCount;i++) {
IndexInput genInput = null;
try {
genInput = directory.openInput(IndexFileNames.SEGMENTS_GEN);
} catch (IOException e) {
message("segments.gen open: IOException " + e);
}
if (genInput != null) {

try {
int version = genInput.readInt();
if (version == FORMAT_LOCKLESS) {
long gen0 = genInput.readLong();
long gen1 = genInput.readLong();
message("fallback check: " + gen0 + "; " + gen1);
if (gen0 == gen1) {
// The file is consistent.
if (gen0 > gen) {
message("fallback to '" + IndexFileNames.SEGMENTS_GEN + "' check: now try generation " + gen0 + " > " + gen);
gen = gen0;
}
break;
}
}
} catch (IOException err2) {
// will retry
} finally {
genInput.close();
}
}
try {
Thread.sleep(defaultGenFileRetryPauseMsec);
} catch (InterruptedException e) {
// will retry
}
}
}

// Method 3 (fallback if Methods 1 & 2 are not
// reliable): since both directory cache and file
// contents cache seem to be stale, just advance the
// generation.
if (2 == method || (1 == method && lastGen == gen && retry)) {

method = 2;

if (genLookaheadCount < defaultGenLookaheadCount) {
gen++;
genLookaheadCount++;
message("look ahead increment gen to " + gen);
}
}

if (lastGen == gen) {

// This means we're about to try the same
// segments_N we last tried. This is allowed,
// exactly once, because the writer could have been in
// the process of writing segments_N last time.

if (retry) {
// OK, we've tried the same segments_N file
// twice in a row, so this must be a real
// error. We throw the original exception we
// got.
throw exc;
} else {
retry = true;
}

} else {
// Segment file has advanced since our last loop, so
// reset retry:
retry = false;
}

lastGen = gen;

segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
gen);

try {
Object v = doBody(segmentFileName);
if (exc != null) {
message("success on " + segmentFileName);
}
return v;
} catch (IOException err) {

// Save the original root cause:
if (exc == null) {
exc = err;
}

message("primary Exception on '" + segmentFileName + "': " + err + "'; will retry: retry=" + retry + "; gen = " + gen);

if (!retry && gen > 1) {

// This is our first time trying this segments
// file (because retry is false), and, there is
// possibly a segments_(N-1) (because gen > 1).
// So, check if the segments_(N-1) exists and
// try it if so:
String prevSegmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
gen-1);

if (directory.fileExists(prevSegmentFileName)) {
message("fallback to prior segment file '" + prevSegmentFileName + "'");
try {
Object v = doBody(prevSegmentFileName);
if (exc != null) {
message("success on fallback " + prevSegmentFileName);
}
return v;
} catch (IOException err2) {
message("secondary Exception on '" + prevSegmentFileName + "': " + err2 + "'; will retry");
}
}
}
}
}
}
finally {
input.close();
}

if(format < 0)
return version;

// We cannot be sure about the format of the file.
// Therefore we have to read the whole file and cannot simply seek to the version entry.

SegmentInfos sis = new SegmentInfos();
sis.read(directory);
return sis.getVersion();
}
/**
 * Subclass must implement this. The assumption is an
 * IOException will be thrown if something goes wrong
 * during the processing that could have been caused by
 * a writer committing.
 */
protected abstract Object doBody(String segmentFileName) throws IOException;
}
}
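
The read(Directory) and readCurrentVersion methods above show the intended pattern for this class. As a stand-alone sketch (the anonymous subclass, the index path, and the final local are assumptions, not part of the patch), a caller wanting the size of the current segments file could write:

    // Sketch: run code against whichever segments_N is current, with retry.
    final Directory dir = FSDirectory.getDirectory("/path/to/index", false);
    long segmentsLength = ((Long) new SegmentInfos.FindSegmentsFile(dir) {
      public Object doBody(String segmentFileName) throws IOException {
        IndexInput in = dir.openInput(segmentFileName);
        try {
          return new Long(in.length());
        } finally {
          in.close();
        }
      }
    }.run()).longValue();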

@@ -33,6 +33,7 @@ import java.util.*;
 */
class SegmentReader extends IndexReader {
private String segment;
private SegmentInfo si;

FieldInfos fieldInfos;
private FieldsReader fieldsReader;

@@ -64,22 +65,24 @@ class SegmentReader extends IndexReader {
private boolean dirty;
private int number;

private void reWrite() throws IOException {
private void reWrite(SegmentInfo si) throws IOException {
// NOTE: norms are re-written in regular directory, not cfs
IndexOutput out = directory().createOutput(segment + ".tmp");

String oldFileName = si.getNormFileName(this.number);
if (oldFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldFileName);
}

si.advanceNormGen(this.number);
IndexOutput out = directory().createOutput(si.getNormFileName(this.number));
try {
out.writeBytes(bytes, maxDoc());
} finally {
out.close();
}
String fileName;
if(cfsReader == null)
fileName = segment + ".f" + number;
else{
// use a different file name if we have compound format
fileName = segment + ".s" + number;
}
directory().renameFile(segment + ".tmp", fileName);
this.dirty = false;
}
}

@@ -131,57 +134,94 @@ class SegmentReader extends IndexReader {
return instance;
}

private void initialize(SegmentInfo si) throws IOException {
segment = si.name;
this.si = si;

// Use compound file directory for some files, if it exists
Directory cfsDir = directory();
if (directory().fileExists(segment + ".cfs")) {
cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
cfsDir = cfsReader;
}
boolean success = false;

// No compound file exists - use the multi-file format
fieldInfos = new FieldInfos(cfsDir, segment + ".fnm");
fieldsReader = new FieldsReader(cfsDir, segment, fieldInfos);
try {
// Use compound file directory for some files, if it exists
Directory cfsDir = directory();
if (si.getUseCompoundFile()) {
cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
cfsDir = cfsReader;
}

tis = new TermInfosReader(cfsDir, segment, fieldInfos);
// No compound file exists - use the multi-file format
fieldInfos = new FieldInfos(cfsDir, segment + ".fnm");
fieldsReader = new FieldsReader(cfsDir, segment, fieldInfos);

// NOTE: the bitvector is stored using the regular directory, not cfs
if (hasDeletions(si))
deletedDocs = new BitVector(directory(), segment + ".del");
tis = new TermInfosReader(cfsDir, segment, fieldInfos);

// NOTE: the bitvector is stored using the regular directory, not cfs
if (hasDeletions(si)) {
deletedDocs = new BitVector(directory(), si.getDelFileName());
}

// make sure that all index files have been read or are kept open
// so that if an index update removes them we'll still have them
freqStream = cfsDir.openInput(segment + ".frq");
proxStream = cfsDir.openInput(segment + ".prx");
openNorms(cfsDir);

if (fieldInfos.hasVectors()) { // open term vector files only as needed
termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment, fieldInfos);
}
success = true;
} finally {

// With lock-less commits, it's entirely possible (and
// fine) to hit a FileNotFound exception above. In
// this case, we want to explicitly close any subset
// of things that were opened so that we don't have to
// wait for a GC to do so.
if (!success) {
doClose();
}
}
}

protected void finalize() {
// patch for pre-1.4.2 JVMs, whose ThreadLocals leak
termVectorsLocal.set(null);
super.finalize();
}
}
protected void doCommit() throws IOException {
if (deletedDocsDirty) { // re-write deleted
deletedDocs.write(directory(), segment + ".tmp");
directory().renameFile(segment + ".tmp", segment + ".del");
String oldDelFileName = si.getDelFileName();
if (oldDelFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldDelFileName);
}

si.advanceDelGen();

// We can write directly to the actual name (vs to a
// .tmp & renaming it) because the file is not live
// until segments file is written:
deletedDocs.write(directory(), si.getDelFileName());
}
if(undeleteAll && directory().fileExists(segment + ".del")){
directory().deleteFile(segment + ".del");
if (undeleteAll && si.hasDeletions()) {
String oldDelFileName = si.getDelFileName();
if (oldDelFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldDelFileName);
}
si.clearDelGen();
}
if (normsDirty) { // re-write norms
si.setNumField(fieldInfos.size());
Enumeration values = norms.elements();
while (values.hasMoreElements()) {
Norm norm = (Norm) values.nextElement();
if (norm.dirty) {
norm.reWrite();
norm.reWrite(si);
}
}
}

@@ -191,8 +231,12 @@ class SegmentReader extends IndexReader {
}

protected void doClose() throws IOException {
fieldsReader.close();
tis.close();
if (fieldsReader != null) {
fieldsReader.close();
}
if (tis != null) {
tis.close();
}

if (freqStream != null)
freqStream.close();

@@ -209,27 +253,19 @@ class SegmentReader extends IndexReader {
}

static boolean hasDeletions(SegmentInfo si) throws IOException {
return si.dir.fileExists(si.name + ".del");
return si.hasDeletions();
}

public boolean hasDeletions() {
return deletedDocs != null;
}

static boolean usesCompoundFile(SegmentInfo si) throws IOException {
return si.dir.fileExists(si.name + ".cfs");
return si.getUseCompoundFile();
}

static boolean hasSeparateNorms(SegmentInfo si) throws IOException {
String[] result = si.dir.list();
String pattern = si.name + ".s";
int patternLength = pattern.length();
for(int i = 0; i < result.length; i++){
if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
return true;
}
return false;
return si.hasSeparateNorms();
}

protected void doDelete(int docNum) {

@@ -249,23 +285,27 @@ class SegmentReader extends IndexReader {
Vector files() throws IOException {
Vector files = new Vector(16);

for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
String name = segment + "." + IndexFileNames.INDEX_EXTENSIONS[i];
if (directory().fileExists(name))
if (si.getUseCompoundFile()) {
String name = segment + ".cfs";
if (directory().fileExists(name)) {
files.addElement(name);
}
} else {
for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
String name = segment + "." + IndexFileNames.INDEX_EXTENSIONS[i];
if (directory().fileExists(name))
files.addElement(name);
}
}

if (si.hasDeletions()) {
files.addElement(si.getDelFileName());
}

for (int i = 0; i < fieldInfos.size(); i++) {
FieldInfo fi = fieldInfos.fieldInfo(i);
if (fi.isIndexed && !fi.omitNorms){
String name;
if(cfsReader == null)
name = segment + ".f" + i;
else
name = segment + ".s" + i;
if (directory().fileExists(name))
String name = si.getNormFileName(i);
if (name != null && directory().fileExists(name))
files.addElement(name);
}
}
return files;
}

@@ -380,7 +420,6 @@ class SegmentReader extends IndexReader {
protected synchronized byte[] getNorms(String field) throws IOException {
Norm norm = (Norm) norms.get(field);
if (norm == null) return null; // not indexed, or norms not stored

if (norm.bytes == null) { // value not yet read
byte[] bytes = new byte[maxDoc()];
norms(field, bytes, 0);

@@ -436,12 +475,10 @@ class SegmentReader extends IndexReader {
for (int i = 0; i < fieldInfos.size(); i++) {
FieldInfo fi = fieldInfos.fieldInfo(i);
if (fi.isIndexed && !fi.omitNorms) {
// look first if there are separate norms in compound format
String fileName = segment + ".s" + fi.number;
Directory d = directory();
if(!d.fileExists(fileName)){
fileName = segment + ".f" + fi.number;
d = cfsDir;
String fileName = si.getNormFileName(fi.number);
if (!si.hasSeparateNorms(fi.number)) {
d = cfsDir;
}
norms.put(fi.name, new Norm(d.openInput(fileName), fi.number));
}

@@ -128,7 +128,7 @@ public class FSDirectory extends Directory {
 * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(String path, boolean create)
throws IOException {
return getDirectory(path, create, null);
return getDirectory(new File(path), create, null, true);
}

/** Returns the directory instance for the named location, using the

@@ -143,10 +143,16 @@ public class FSDirectory extends Directory {
 * @param lockFactory instance of {@link LockFactory} providing the
 * locking implementation.
 * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(String path, boolean create,
LockFactory lockFactory, boolean doRemoveOldFiles)
throws IOException {
return getDirectory(new File(path), create, lockFactory, doRemoveOldFiles);
}

public static FSDirectory getDirectory(String path, boolean create,
LockFactory lockFactory)
throws IOException {
return getDirectory(new File(path), create, lockFactory);
return getDirectory(new File(path), create, lockFactory, true);
}

/** Returns the directory instance for the named location.

@@ -158,9 +164,9 @@ public class FSDirectory extends Directory {
 * @param file the path to the directory.
 * @param create if true, create, or erase any existing contents.
 * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(File file, boolean create)
public static FSDirectory getDirectory(File file, boolean create, boolean doRemoveOldFiles)
throws IOException {
return getDirectory(file, create, null);
return getDirectory(file, create, null, doRemoveOldFiles);
}

/** Returns the directory instance for the named location, using the

@@ -176,7 +182,7 @@ public class FSDirectory extends Directory {
 * locking implementation.
 * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(File file, boolean create,
LockFactory lockFactory)
LockFactory lockFactory, boolean doRemoveOldFiles)
throws IOException {
file = new File(file.getCanonicalPath());
FSDirectory dir;

@@ -188,7 +194,7 @@ public class FSDirectory extends Directory {
} catch (Exception e) {
throw new RuntimeException("cannot load FSDirectory class: " + e.toString(), e);
}
dir.init(file, create, lockFactory);
dir.init(file, create, lockFactory, doRemoveOldFiles);
DIRECTORIES.put(file, dir);
} else {

@@ -199,7 +205,7 @@ public class FSDirectory extends Directory {
}

if (create) {
dir.create();
dir.create(doRemoveOldFiles);
}
}
}

@@ -209,23 +215,35 @@ public class FSDirectory extends Directory {
return dir;
}

public static FSDirectory getDirectory(File file, boolean create,
LockFactory lockFactory)
throws IOException
{
return getDirectory(file, create, lockFactory, true);
}

public static FSDirectory getDirectory(File file, boolean create)
throws IOException {
return getDirectory(file, create, true);
}

private File directory = null;
private int refCount;

protected FSDirectory() {}; // permit subclassing

private void init(File path, boolean create) throws IOException {
private void init(File path, boolean create, boolean doRemoveOldFiles) throws IOException {
directory = path;

if (create) {
create();
create(doRemoveOldFiles);
}

if (!directory.isDirectory())
throw new IOException(path + " not a directory");
}

private void init(File path, boolean create, LockFactory lockFactory) throws IOException {
private void init(File path, boolean create, LockFactory lockFactory, boolean doRemoveOldFiles) throws IOException {

// Set up lockFactory with cascaded defaults: if an instance was passed in,
// use that; else if locks are disabled, use NoLockFactory; else if the

@@ -280,10 +298,10 @@ public class FSDirectory extends Directory {

setLockFactory(lockFactory);

init(path, create);
init(path, create, doRemoveOldFiles);
}

private synchronized void create() throws IOException {
private synchronized void create(boolean doRemoveOldFiles) throws IOException {
if (!directory.exists())
if (!directory.mkdirs())
throw new IOException("Cannot create directory: " + directory);

@@ -291,13 +309,15 @@ public class FSDirectory extends Directory {
if (!directory.isDirectory())
throw new IOException(directory + " not a directory");

String[] files = directory.list(new IndexFileNameFilter()); // clear old files
if (files == null)
throw new IOException("Cannot read directory " + directory.getAbsolutePath());
for (int i = 0; i < files.length; i++) {
File file = new File(directory, files[i]);
if (!file.delete())
throw new IOException("Cannot delete " + file);
if (doRemoveOldFiles) {
String[] files = directory.list(IndexFileNameFilter.getFilter()); // clear old files
if (files == null)
throw new IOException("Cannot read directory " + directory.getAbsolutePath());
for (int i = 0; i < files.length; i++) {
File file = new File(directory, files[i]);
if (!file.delete())
throw new IOException("Cannot delete " + file);
}
}

lockFactory.clearAllLocks();

@@ -305,7 +325,7 @@ public class FSDirectory extends Directory {

/** Returns an array of strings, one for each Lucene index file in the directory. */
public String[] list() {
return directory.list(new IndexFileNameFilter());
return directory.list(IndexFileNameFilter.getFilter());
}

/** Returns true iff a file with the given name exists. */

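For illustration, a usage sketch of the new doRemoveOldFiles flag added above (the paths are placeholders; passing null for lockFactory falls through to the cascaded defaults described in the javadoc):

    // create=true normally erases existing index files; the last argument
    // now makes that erase optional:
    FSDirectory wiped = FSDirectory.getDirectory(new File("/tmp/idx"), true, null, true);
    FSDirectory kept  = FSDirectory.getDirectory(new File("/tmp/idx"), true, null, false);
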
@@ -18,6 +18,7 @@ package org.apache.lucene.store;
 */

import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;
import java.io.Serializable;
import java.util.Hashtable;

@@ -105,7 +106,7 @@ public final class RAMDirectory extends Directory implements Serializable {
}

/** Returns an array of strings, one for each file in the directory. */
public final String[] list() {
public synchronized final String[] list() {
String[] result = new String[files.size()];
int i = 0;
Enumeration names = files.keys();

@@ -129,7 +130,7 @@ public final class RAMDirectory extends Directory implements Serializable {
/** Set the modified time of an existing file to now. */
public void touchFile(String name) {
// final boolean MONITOR = false;

RAMFile file = (RAMFile)files.get(name);
long ts2, ts1 = System.currentTimeMillis();
do {

@@ -175,8 +176,11 @@ public final class RAMDirectory extends Directory implements Serializable {
}

/** Returns a stream reading an existing file. */
public final IndexInput openInput(String name) {
public final IndexInput openInput(String name) throws IOException {
RAMFile file = (RAMFile)files.get(name);
if (file == null) {
throw new FileNotFoundException(name);
}
return new RAMInputStream(file);
}

@@ -32,6 +32,7 @@ import org.apache.lucene.document.Field;

import java.util.Collection;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;

public class TestIndexReader extends TestCase

@@ -222,6 +223,11 @@ public class TestIndexReader extends TestCase
assertEquals("deleted count", 100, deleted);
assertEquals("deleted docFreq", 100, reader.docFreq(searchTerm));
assertTermDocsCount("deleted termDocs", reader, searchTerm, 0);

// open a 2nd reader to make sure first reader can
// commit its changes (.del) while second reader
// is open:
IndexReader reader2 = IndexReader.open(dir);
reader.close();

// CREATE A NEW READER and re-test

@@ -231,10 +237,73 @@ public class TestIndexReader extends TestCase
reader.close();
}

// Make sure you can set norms & commit even if a reader
// is open against the index:
public void testWritingNorms() throws IOException
{
String tempDir = System.getProperty("tempDir");
if (tempDir == null)
throw new IOException("tempDir undefined, cannot run test");

File indexDir = new File(tempDir, "lucenetestnormwriter");
Directory dir = FSDirectory.getDirectory(indexDir, true);
IndexWriter writer = null;
IndexReader reader = null;
Term searchTerm = new Term("content", "aaa");

// add 1 document with term: aaa
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDoc(writer, searchTerm.text());
writer.close();

// now open reader & set norm for doc 0
reader = IndexReader.open(dir);
reader.setNorm(0, "content", (float) 2.0);

// we should be holding the write lock now:
assertTrue("locked", IndexReader.isLocked(dir));

reader.commit();

// we should not be holding the write lock now:
assertTrue("not locked", !IndexReader.isLocked(dir));

// open a 2nd reader:
IndexReader reader2 = IndexReader.open(dir);

// set norm again for doc 0
reader.setNorm(0, "content", (float) 3.0);
assertTrue("locked", IndexReader.isLocked(dir));

reader.close();

// we should not be holding the write lock now:
assertTrue("not locked", !IndexReader.isLocked(dir));

reader2.close();
dir.close();

rmDir(indexDir);
}

public void testDeleteReaderWriterConflictUnoptimized() throws IOException{
deleteReaderWriterConflict(false);
}

public void testOpenEmptyDirectory() throws IOException{
String dirName = "test.empty";
File fileDirName = new File(dirName);
if (!fileDirName.exists()) {
fileDirName.mkdir();
}
try {
IndexReader reader = IndexReader.open(fileDirName);
fail("opening IndexReader on empty directory failed to produce FileNotFoundException");
} catch (FileNotFoundException e) {
// GOOD
}
}

public void testDeleteReaderWriterConflictOptimized() throws IOException{
deleteReaderWriterConflict(true);

@@ -368,12 +437,36 @@ public class TestIndexReader extends TestCase
assertFalse(IndexReader.isLocked(dir)); // reader only, no lock
long version = IndexReader.lastModified(dir);
reader.close();
// modify index and check version has been incremented:
// modify index and check version has been
// incremented:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
writer.close();
reader = IndexReader.open(dir);
assertTrue(version < IndexReader.getCurrentVersion(dir));
assertTrue("old lastModified is " + version + "; new lastModified is " + IndexReader.lastModified(dir), version <= IndexReader.lastModified(dir));
reader.close();
}

public void testVersion() throws IOException {
assertFalse(IndexReader.indexExists("there_is_no_such_index"));
Directory dir = new RAMDirectory();
assertFalse(IndexReader.indexExists(dir));
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
assertTrue(IndexReader.isLocked(dir)); // writer open, so dir is locked
writer.close();
assertTrue(IndexReader.indexExists(dir));
IndexReader reader = IndexReader.open(dir);
assertFalse(IndexReader.isLocked(dir)); // reader only, no lock
long version = IndexReader.getCurrentVersion(dir);
reader.close();
// modify index and check version has been
// incremented:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
writer.close();
reader = IndexReader.open(dir);
assertTrue("old version is " + version + "; new version is " + IndexReader.getCurrentVersion(dir), version < IndexReader.getCurrentVersion(dir));
reader.close();
}

@@ -412,6 +505,40 @@ public class TestIndexReader extends TestCase
reader.close();
}

public void testUndeleteAllAfterClose() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
addDocumentWithFields(writer);
writer.close();
IndexReader reader = IndexReader.open(dir);
reader.deleteDocument(0);
reader.deleteDocument(1);
reader.close();
reader = IndexReader.open(dir);
reader.undeleteAll();
assertEquals(2, reader.numDocs()); // nothing has really been deleted thanks to undeleteAll()
reader.close();
}

public void testUndeleteAllAfterCloseThenReopen() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
addDocumentWithFields(writer);
writer.close();
IndexReader reader = IndexReader.open(dir);
reader.deleteDocument(0);
reader.deleteDocument(1);
reader.close();
reader = IndexReader.open(dir);
reader.undeleteAll();
reader.close();
reader = IndexReader.open(dir);
assertEquals(2, reader.numDocs()); // nothing has really been deleted thanks to undeleteAll()
reader.close();
}

public void testDeleteReaderReaderConflictUnoptimized() throws IOException{
deleteReaderReaderConflict(false);
}

@@ -562,4 +689,11 @@ public class TestIndexReader extends TestCase
doc.add(new Field("content", value, Field.Store.NO, Field.Index.TOKENIZED));
writer.addDocument(doc);
}
private void rmDir(File dir) {
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
files[i].delete();
}
dir.delete();
}
}

@@ -1,6 +1,7 @@
package org.apache.lucene.index;

import java.io.IOException;
import java.io.File;

import junit.framework.TestCase;

@@ -10,7 +11,10 @@ import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;

/**

@@ -28,14 +32,11 @@ public class TestIndexWriter extends TestCase
int i;

IndexWriter.setDefaultWriteLockTimeout(2000);
IndexWriter.setDefaultCommitLockTimeout(2000);
assertEquals(2000, IndexWriter.getDefaultWriteLockTimeout());
assertEquals(2000, IndexWriter.getDefaultCommitLockTimeout());

writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

IndexWriter.setDefaultWriteLockTimeout(1000);
IndexWriter.setDefaultCommitLockTimeout(1000);

// add 100 documents
for (i = 0; i < 100; i++) {

@@ -72,6 +73,12 @@ public class TestIndexWriter extends TestCase
assertEquals(60, reader.maxDoc());
assertEquals(60, reader.numDocs());
reader.close();

// make sure opening a new index for create over
// this existing one works correctly:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
assertEquals(0, writer.docCount());
writer.close();
}

private void addDoc(IndexWriter writer) throws IOException

@@ -80,4 +87,192 @@ public class TestIndexWriter extends TestCase
doc.add(new Field("content", "aaa", Field.Store.NO, Field.Index.TOKENIZED));
writer.addDocument(doc);
}

// Make sure we can open an index for create even when a
// reader holds it open (this fails pre lock-less
// commits on windows):
public void testCreateWithReader() throws IOException {
String tempDir = System.getProperty("java.io.tmpdir");
if (tempDir == null)
throw new IOException("java.io.tmpdir undefined, cannot run test");
File indexDir = new File(tempDir, "lucenetestindexwriter");
Directory dir = FSDirectory.getDirectory(indexDir, true);

// add one document & close writer
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDoc(writer);
writer.close();

// now open reader:
IndexReader reader = IndexReader.open(dir);
assertEquals("should be one document", reader.numDocs(), 1);

// now open index for create:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
assertEquals("should be zero documents", writer.docCount(), 0);
addDoc(writer);
writer.close();

assertEquals("should be one document", reader.numDocs(), 1);
IndexReader reader2 = IndexReader.open(dir);
assertEquals("should be one document", reader2.numDocs(), 1);
reader.close();
reader2.close();
rmDir(indexDir);
}

// Simulate a writer that crashed while writing segments
// file: make sure we can still open the index (ie,
// gracefully fallback to the previous segments file),
// and that we can add to the index:
public void testSimulatedCrashedWriter() throws IOException {
Directory dir = new RAMDirectory();

IndexWriter writer = null;

writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}

// close
writer.close();

long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

// Make the next segments file, with last byte
// missing, to simulate a writer that crashed while
// writing segments file:
String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
1+gen);
IndexInput in = dir.openInput(fileNameIn);
IndexOutput out = dir.createOutput(fileNameOut);
long length = in.length();
for(int i=0;i<length-1;i++) {
out.writeByte(in.readByte());
}
in.close();
out.close();

IndexReader reader = null;
try {
reader = IndexReader.open(dir);
} catch (Exception e) {
fail("reader failed to open on a crashed index");
}
reader.close();

try {
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
} catch (Exception e) {
fail("writer failed to open on a crashed index");
}

// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}

// close
writer.close();
}

// Simulate a corrupt index by removing last byte of
// latest segments file and make sure we get an
// IOException trying to open the index:
public void testSimulatedCorruptIndex1() throws IOException {
Directory dir = new RAMDirectory();

IndexWriter writer = null;

writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}

// close
writer.close();

long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
1+gen);
IndexInput in = dir.openInput(fileNameIn);
IndexOutput out = dir.createOutput(fileNameOut);
long length = in.length();
for(int i=0;i<length-1;i++) {
out.writeByte(in.readByte());
}
in.close();
out.close();
dir.deleteFile(fileNameIn);

IndexReader reader = null;
try {
reader = IndexReader.open(dir);
fail("reader did not hit IOException on opening a corrupt index");
} catch (Exception e) {
}
if (reader != null) {
reader.close();
}
}

// Simulate a corrupt index by removing one of the cfs
// files and make sure we get an IOException trying to
// open the index:
public void testSimulatedCorruptIndex2() throws IOException {
Directory dir = new RAMDirectory();

IndexWriter writer = null;

writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}

// close
writer.close();

long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

String[] files = dir.list();
for(int i=0;i<files.length;i++) {
if (files[i].endsWith(".cfs")) {
dir.deleteFile(files[i]);
break;
}
}

IndexReader reader = null;
try {
reader = IndexReader.open(dir);
fail("reader did not hit IOException on opening a corrupt index");
} catch (Exception e) {
}
if (reader != null) {
reader.close();
}
}

private void rmDir(File dir) {
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
files[i].delete();
}
dir.delete();
}
}

@@ -80,6 +80,21 @@ public class TestMultiReader extends TestCase {
assertEquals( 1, reader.numDocs() );
reader.undeleteAll();
assertEquals( 2, reader.numDocs() );

// Ensure undeleteAll survives commit/close/reopen:
reader.commit();
reader.close();
sis.read(dir);
reader = new MultiReader(dir, sis, false, readers);
assertEquals( 2, reader.numDocs() );

reader.deleteDocument(0);
assertEquals( 1, reader.numDocs() );
reader.commit();
reader.close();
sis.read(dir);
reader = new MultiReader(dir, sis, false, readers);
assertEquals( 1, reader.numDocs() );
}

Binary file not shown.
Binary file not shown.

@@ -58,9 +58,9 @@ public class TestLockFactory extends TestCase {

// Both write lock and commit lock should have been created:
assertEquals("# of unique locks created (after instantiating IndexWriter)",
2, lf.locksCreated.size());
assertTrue("# calls to makeLock <= 2 (after instantiating IndexWriter)",
lf.makeLockCount > 2);
1, lf.locksCreated.size());
assertTrue("# calls to makeLock is 0 (after instantiating IndexWriter)",
lf.makeLockCount >= 1);

for(Enumeration e = lf.locksCreated.keys(); e.hasMoreElements();) {
String lockName = (String) e.nextElement();

@@ -90,6 +90,7 @@ public class TestLockFactory extends TestCase {
try {
writer2 = new IndexWriter(dir, new WhitespaceAnalyzer(), false);
} catch (Exception e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException with no locking");
}

@@ -234,6 +235,7 @@ public class TestLockFactory extends TestCase {
try {
writer2 = new IndexWriter(indexDirName, new WhitespaceAnalyzer(), false);
} catch (IOException e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException with locking disabled");
}

@@ -266,6 +268,7 @@ public class TestLockFactory extends TestCase {
try {
fs2 = FSDirectory.getDirectory(indexDirName, true, lf);
} catch (IOException e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException because LockFactory instances are the same");
}

@@ -294,7 +297,6 @@ public class TestLockFactory extends TestCase {

public void _testStressLocks(LockFactory lockFactory, String indexDirName) throws IOException {
FSDirectory fs1 = FSDirectory.getDirectory(indexDirName, true, lockFactory);
// fs1.setLockFactory(NoLockFactory.getNoLockFactory());

// First create a 1 doc index:
IndexWriter w = new IndexWriter(fs1, new WhitespaceAnalyzer(), true);

@@ -405,6 +407,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Writer: creation hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
if (writer != null) {
try {

@@ -413,6 +416,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Writer: addDoc hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
try {
writer.close();

@@ -420,6 +424,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Writer: close hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
writer = null;
}

@@ -446,6 +451,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Searcher: create hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
if (searcher != null) {
Hits hits = null;

@@ -455,6 +461,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Searcher: search hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
// System.out.println(hits.length() + " total results");
try {

@@ -463,6 +470,7 @@ public class TestLockFactory extends TestCase {
hitException = true;
System.out.println("Stress Test Index Searcher: close hit unexpected exception: " + e.toString());
e.printStackTrace(System.out);
break;
}
searcher = null;
}

@@ -14,7 +14,7 @@

<p>
This document defines the index file formats used
in Lucene version 2.0. If you are using a different
in Lucene version 2.1. If you are using a different
version of Lucene, please consult the copy of
<code>docs/fileformats.html</code> that was distributed
with the version you are using.

@@ -43,6 +43,18 @@
describing how file formats have changed from prior versions.
</p>

<p>
In version 2.1, the file format was changed to allow
lock-less commits (ie, no more commit lock). The
change is fully backwards compatible: you can open a
pre-2.1 index for searching or adding/deleting of
docs. When the new segments file is saved
(committed), it will be written in the new file format
(meaning no specific "upgrade" process is needed).
But note that once a commit has occurred, pre-2.1
Lucene will not be able to read the index.
</p>

</section>

<section name="Definitions">

@@ -260,6 +272,18 @@
required.
</p>

<p>
As of version 2.1 (lock-less commits), file names are
never re-used (there is one exception, "segments.gen",
see below). That is, when any file is saved to the
Directory it is given a never before used filename.
This is achieved using a simple generations approach.
For example, the first segments file is segments_1,
then segments_2, etc. The generation is a sequential
long integer represented in alpha-numeric (base 36)
form.
</p>
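
As a concrete illustration (a sketch, not part of the document source), the base-36 generation arithmetic is just standard Java:

    // Generation 41 is "15" in base 36, so the file is "segments_15";
    // Long.parseLong("15", Character.MAX_RADIX) recovers 41.
    String name = "segments_" + Long.toString(41, Character.MAX_RADIX);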
|
||||
|
||||
</section>
|
||||
|
||||
<section name="Primitive Types">
|
||||
|
@ -696,22 +720,48 @@
|
|||
|
||||
<p>
|
||||
The active segments in the index are stored in the
|
||||
segment info file. An index only has
|
||||
a single file in this format, and it is named "segments".
|
||||
This lists each segment by name, and also contains the size of each
|
||||
segment.
|
||||
segment info file, <tt>segments_N</tt>. There may
|
||||
be one or more <tt>segments_N</tt> files in the
|
||||
index; however, the one with the largest
|
||||
generation is the active one (when older
|
||||
segments_N files are present it's because they
|
||||
temporarily cannot be deleted, or, a writer is in
|
||||
the process of committing). This file lists each
|
||||
segment by name, has details about the separate
|
||||
norms and deletion files, and also contains the
|
||||
size of each segment.
|
||||
</p>

<p>
As of 2.1, there is also a file
<tt>segments.gen</tt>. This file contains the
current generation (the <tt>_N</tt> in
<tt>segments_N</tt>) of the index. This is
used only as a fallback in case the current
generation cannot be accurately determined by
directory listing alone (as is the case for some
NFS clients with time-based directory cache
expiration). This file simply contains an Int32
version header (SegmentInfos.FORMAT_LOCKLESS =
-2), followed by the generation recorded as Int64,
written twice.
</p>
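
<p>
(A hedged sketch of reading segments.gen as just described. Lucene
writes these values high-byte first, which matches DataInputStream;
the generation is written twice so a partially written file can be
detected. This is an illustration, not a drop-in replacement for
SegmentInfos.)
</p>
<pre>
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadSegmentsGen {
  static final int FORMAT_LOCKLESS = -2;

  static long readGeneration(String path) throws IOException {
    DataInputStream in = new DataInputStream(new FileInputStream(path));
    try {
      int format = in.readInt();          // Int32 version header
      if (format != FORMAT_LOCKLESS)
        throw new IOException("unexpected format: " + format);
      long gen0 = in.readLong();          // generation, written twice
      long gen1 = in.readLong();
      if (gen0 != gen1)
        throw new IOException("segments.gen is mid-write or corrupt");
      return gen0;
    } finally {
      in.close();
    }
  }
}
</pre>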

<p>
<b>Pre-2.1:</b>
Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize><sup>SegCount</sup>
</p>

<p>
Format, NameCounter, SegCount, SegSize --> UInt32
</p>

<p>
<b>2.1 and above:</b>
Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> ><sup>SegCount</sup>, IsCompoundFile
</p>

<p>
Version --> UInt64
Format, NameCounter, SegCount, SegSize, NumField --> Int32
</p>

<p>
Version, DelGen, NormGen --> Int64
</p>

<p>
@ -719,7 +769,11 @@
</p>

<p>
Format is -1 in Lucene 1.4.
IsCompoundFile --> Int8
</p>

<p>
Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
</p>

<p>
@ -740,65 +794,79 @@
SegSize is the number of documents contained in the segment index.
</p>

<p>
DelGen is the generation count of the separate
deletes file. If this is -1, there are no
separate deletes. If it is 0, this is a pre-2.1
segment and you must check the filesystem for the
existence of _X.del. Anything above zero means
there are separate deletes (_X_N.del).
</p>

<p>
NumField is the size of the array for NormGen, or
-1 if there are no NormGens stored.
</p>

<p>
NormGen records the generation of the separate
norms files. If NumField is -1, there are no
NormGens stored; they are all assumed to be 0
when the segments file was written pre-2.1, and all
assumed to be -1 when the segments file is 2.1 or
above. The generation then has the same meaning
as DelGen (above).
</p>
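
<p>
(To make the DelGen/NormGen rules concrete, a minimal sketch of mapping
a generation to a file name. fileNameFromGeneration is a hypothetical
helper; the base-36 suffix follows the generations convention described
earlier.)
</p>
<pre>
public class GenerationFiles {
  // gen == -1: no separate file; gen == 0: pre-2.1, probe for _X.ext;
  // gen > 0: separate file _X_N.ext with N in base 36.
  static String fileNameFromGeneration(String segName, String ext, long gen) {
    if (gen == -1) {
      return null;
    } else if (gen == 0) {
      return segName + "." + ext;                                // e.g. _X.del
    } else {
      return segName + "_" + Long.toString(gen, 36) + "." + ext; // e.g. _X_N.del
    }
  }

  public static void main(String[] args) {
    System.out.println(fileNameFromGeneration("_3", "del", -1)); // null
    System.out.println(fileNameFromGeneration("_3", "del", 0));  // _3.del
    System.out.println(fileNameFromGeneration("_3", "del", 2));  // _3_2.del
  }
}
</pre>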

<p>
IsCompoundFile records whether the segment is
written as a compound file or not. If this is -1,
the segment is not a compound file. If it is 1,
the segment is a compound file. If it is 0, you
must check the filesystem to see whether _X.cfs
exists.
</p>
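
<p>
(The IsCompoundFile tri-state might be resolved as in this short,
hedged sketch.)
</p>
<pre>
import java.io.File;

public class CompoundFileCheck {
  static boolean isCompoundFile(File indexDir, String segName, byte flag) {
    if (flag == 1) return true;     // definitely a compound file
    if (flag == -1) return false;   // definitely not
    // 0: pre-2.1 entry; fall back to probing for _X.cfs
    return new File(indexDir, segName + ".cfs").exists();
  }
}
</pre>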

</subsection>

<subsection name="Lock Files">
<subsection name="Lock File">

<p>
Several files are used to indicate that another
process is using an index. Note that these files are not
A write lock is used to indicate that another
process is writing to the index. Note that this file is not
stored in the index directory itself, but rather in the
system's temporary directory, as indicated in the Java
system property "java.io.tmpdir".
</p>

<ul>
<li>
<p>
When a file named "commit.lock"
is present, a process is currently re-writing the "segments"
file and deleting outdated segment index files, or a process is
reading the "segments"
file and opening the files of the segments it names. This lock file
prevents files from being deleted by another process after a process
has read the "segments"
file but before it has managed to open all of the files of the
segments named therein.
</p>
</li>

<p>
The write lock is named "XXXX-write.lock" where
XXXX is typically a unique prefix computed from the
directory path to the index. When this file is
present, a process is currently adding documents
to an index, or removing files from that index.
This lock file prevents several processes from
attempting to modify an index at the same time.
</p>
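
<p>
(One plausible way to derive the per-directory prefix, loosely modeled
on FSDirectory's digest of the canonical index path. The exact
"lucene-" + hex-MD5 format shown here is an assumption, not documented
behavior.)
</p>
<pre>
import java.io.File;
import java.security.MessageDigest;

public class LockName {
  static String writeLockName(File indexDir) throws Exception {
    // Assumption: prefix = "lucene-" + hex MD5 of the canonical path.
    byte[] digest = MessageDigest.getInstance("MD5")
        .digest(indexDir.getCanonicalPath().getBytes("UTF-8"));
    StringBuilder buf = new StringBuilder("lucene-");
    for (byte b : digest) {
      buf.append(String.format("%02x", b));
    }
    return buf.append("-write.lock").toString();
  }
}
</pre>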

<p>
Note that prior to version 2.1, Lucene also used a
commit lock. This was removed in 2.1.
</p>

<li>
<p>
When a file named "write.lock"
is present, a process is currently adding documents to an index, or
removing files from that index. This lock file prevents several
processes from attempting to modify an index at the same time.
</p>
</li>
</ul>
</subsection>

<subsection name="Deletable File">

<p>
A file named "deletable"
contains the names of files that are no longer used by the index, but
which could not be deleted. This is only used on Win32, where a
file may not be deleted while it is still open. On other platforms
the file contains only null bytes.
Prior to Lucene 2.1 there was a file "deletable"
that contained details about files that needed to be
deleted. As of 2.1, a writer dynamically computes
the files that are deletable instead, so no file
is written.
</p>
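
<p>
(A hedged sketch of "dynamically computing" deletable files: anything
following index-file naming that the current commit does not reference
can be removed. The referenced set would come from parsing the active
segments_N; isIndexFile is a hypothetical predicate.)
</p>
<pre>
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ComputeDeletable {
  static List<File> deletableFiles(File indexDir, Set<String> referenced) {
    List<File> deletable = new ArrayList<File>();
    String[] names = indexDir.list();
    if (names != null) {
      for (String name : names) {
        if (isIndexFile(name) && !referenced.contains(name)) {
          deletable.add(new File(indexDir, name));
        }
      }
    }
    return deletable;
  }

  // Hypothetical: recognize Lucene index files by their naming pattern.
  static boolean isIndexFile(String name) {
    return name.startsWith("segments") || name.startsWith("_");
  }
}
</pre>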

<p>
Deletable --> DeletableCount,
<DeletableName><sup>DeletableCount</sup>
</p>

<p>DeletableCount --> UInt32
</p>
<p>DeletableName -->
String
</p>
</subsection>

<subsection name="Compound Files">