Lockless commits

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@476359 13f79535-47bb-0310-9956-ffa450edef68
Michael McCandless 2006-11-17 23:18:47 +00:00
parent bd6f012511
commit d634ccf4e9
20 changed files with 1956 additions and 493 deletions

View File: CHANGES.txt

@ -104,6 +104,15 @@ API Changes
9. LUCENE-657: Made FuzzyQuery non-final and inner ScoreTerm protected.
(Steven Parkes via Otis Gospodnetic)
10. LUCENE-701: Lockless commits: a commit lock is no longer required
when a writer commits and a reader opens the index. This includes
a change to the index file format (see docs/fileformats.html for
details). It also removes all APIs associated with the commit
lock & its timeout. Readers are now truly read-only and do not
block one another on startup. This is the first step to getting
Lucene to work correctly over NFS (second step is
LUCENE-710). (Mike McCandless)
Bug fixes
1. Fixed the web application demo (built with "ant war-demo") which

View File: docs/fileformats.html

@ -118,7 +118,7 @@ limitations under the License.
<blockquote>
<p>
This document defines the index file formats used
in Lucene version 2.1. If you are using a different
version of Lucene, please consult the copy of
<code>docs/fileformats.html</code> that was distributed
with the version you are using.
@ -142,6 +142,17 @@ limitations under the License.
<p>
Compatibility notes are provided in this document,
describing how file formats have changed from prior versions.
</p>
<p>
In version 2.1, the file format was changed to allow
lock-less commits (ie, no more commit lock). The
change is fully backwards compatible: you can open a
pre-2.1 index for searching or adding/deleting of
docs. When the new segments file is saved
(committed), it will be written in the new file format
(meaning no specific "upgrade" process is needed).
But note that once a commit has occurred, pre-2.1
Lucene will not be able to read the index.
</p>
</blockquote>
</p>
@ -403,6 +414,17 @@ limitations under the License.
Typically, all segments
in an index are stored in a single directory, although this is not
required.
</p>
<p>
As of version 2.1 (lock-less commits), file names are
never re-used (there is one exception, "segments.gen",
see below). That is, when any file is saved to the
Directory it is given a never before used filename.
This is achieved using a simple generations approach.
For example, the first segments file is segments_1,
then segments_2, etc. The generation is a sequential
long integer represented in alpha-numeric (base 36)
form.
</p>
</blockquote>
</p>
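To make the naming scheme concrete, here is a minimal sketch (an illustrative class of our own, not part of Lucene; the base-36 encoding matches the fileNameFromGeneration code added elsewhere in this commit):

public class GenerationNames {
  // Generation 0 denotes the pre-2.1 "segments" file; higher
  // generations append "_<gen>" encoded in base 36
  // (Character.MAX_RADIX == 36).
  public static String segmentsFileName(long gen) {
    if (gen == 0) {
      return "segments";
    }
    return "segments_" + Long.toString(gen, Character.MAX_RADIX);
  }
  public static void main(String[] args) {
    System.out.println(segmentsFileName(1));  // segments_1
    System.out.println(segmentsFileName(36)); // segments_10 (base 36)
  }
}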
@ -1080,25 +1102,53 @@ limitations under the License.
<blockquote>
<p>
The active segments in the index are stored in the
segment info file, <tt>segments_N</tt>. There may
be one or more <tt>segments_N</tt> files in the
index; however, the one with the largest
generation is the active one (when older
segments_N files are present it's because they
temporarily cannot be deleted, or, a writer is in
the process of committing). This file lists each
segment by name, has details about the separate
norms and deletion files, and also contains the
size of each segment.
</p>
<p>
As of 2.1, there is also a file
<tt>segments.gen</tt>. This file contains the
current generation (the <tt>_N</tt> in
<tt>segments_N</tt>) of the index. This is
used only as a fallback in case the current
generation cannot be accurately determined by
directory listing alone (as is the case for some
NFS clients with time-based directory cache
expiration). This file simply contains an Int32
version header (SegmentInfos.FORMAT_LOCKLESS =
-2), followed by the generation recorded as Int64,
written twice.
</p>
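A minimal sketch of parsing this layout (illustrative only; it uses plain java.io rather than Lucene's IndexInput, assuming the same high-byte-first integer layout that DataInputStream expects):

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadSegmentsGen {
  // Reads segments.gen: an Int32 format header (-2, i.e.
  // SegmentInfos.FORMAT_LOCKLESS) followed by the generation written
  // twice as Int64. Returns -1 if the contents look invalid, in which
  // case a reader falls back to other means of finding the generation.
  public static long readGeneration(String path) throws IOException {
    DataInputStream in = new DataInputStream(new FileInputStream(path));
    try {
      int format = in.readInt();
      long gen0 = in.readLong();
      long gen1 = in.readLong();
      if (format != -2 || gen0 != gen1) {
        return -1;
      }
      return gen0;
    } finally {
      in.close();
    }
  }
}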
<p>
<b>Pre-2.1:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;<sup>SegCount</sup>
</p>
<p>
<b>2.1 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> &gt;<sup>SegCount</sup>, IsCompoundFile
</p>
<p>
Format, NameCounter, SegCount, SegSize, NumField --&gt; Int32
</p>
<p>
Version, DelGen, NormGen --&gt; Int64
</p>
<p>
SegName --&gt; String
</p>
<p>
IsCompoundFile --&gt; Int8
</p>
<p>
Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
</p>
<p>
Version counts how often the index has been
@ -1113,6 +1163,35 @@ limitations under the License.
</p>
<p>
SegSize is the number of documents contained in the segment index.
</p>
<p>
DelGen is the generation count of the separate
deletes file. If this is -1, there are no
separate deletes. If it is 0, this is a pre-2.1
segment and you must check the filesystem for the
existence of _X.del. Anything above zero means
there are separate deletes (_X_N.del).
</p>
<p>
NumField is the size of the array for NormGen, or
-1 if there are no NormGens stored.
</p>
<p>
NormGen records the generation of each field's
separate norms file. If NumField is -1, no
NormGens are stored: they are all assumed to be 0
if the segment was written pre-2.1, and all
assumed to be -1 if the segments file is 2.1 or
above. The generation then has the same meaning
as DelGen (above).
</p>
<p>
IsCompoundFile records whether the segment is
written as a compound file or not. If this is -1,
the segment is not a compound file. If it is 1,
the segment is a compound file. If it is 0 (a
pre-2.1 segment), check the filesystem to see
whether _X.cfs exists.
</p>
</blockquote>
</td></tr>
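The tri-state convention shared by DelGen, NormGen and IsCompoundFile can be summarized in a small sketch (a hypothetical helper, not part of this commit):

import java.io.File;

public class TriStateSketch {
  // DelGen semantics from the spec above: -1 means definitely no
  // separate deletes, anything above zero means definitely some, and
  // 0 (pre-2.1) means the filesystem must be consulted for _X.del.
  public static boolean hasSeparateDeletes(long delGen, File indexDir, String segName) {
    if (delGen == -1) return false;
    if (delGen > 0) return true;
    return new File(indexDir, segName + ".del").exists();
  }
}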
@ -1121,42 +1200,31 @@ limitations under the License.
<table border="0" cellspacing="0" cellpadding="2" width="100%"> <table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#828DA6"> <tr><td bgcolor="#828DA6">
<font color="#ffffff" face="arial,helvetica,sanserif"> <font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Lock Files"><strong>Lock Files</strong></a> <a name="Lock File"><strong>Lock File</strong></a>
</font> </font>
</td></tr> </td></tr>
<tr><td> <tr><td>
<blockquote> <blockquote>
<p> <p>
Several files are used to indicate that another A write lock is used to indicate that another
process is using an index. Note that these files are not process is writing to the index. Note that this file is not
stored in the index directory itself, but rather in the stored in the index directory itself, but rather in the
system's temporary directory, as indicated in the Java system's temporary directory, as indicated in the Java
system property "java.io.tmpdir". system property "java.io.tmpdir".
</p> </p>
<p>
The write lock is named "XXXX-write.lock" where
XXXX is typically a unique prefix computed from the
directory path to the index. When this file is
present, a process is currently adding documents
to an index, or removing files from that index.
This lock file prevents several processes from
attempting to modify an index at the same time.
</p>
<p>
Note that prior to version 2.1, Lucene also used a
commit lock. This was removed in 2.1.
</p>
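Purely as an illustration of how such a per-directory prefix might be derived (hypothetical; the actual prefix computation is not shown in this diff):

import java.security.MessageDigest;

public class LockNameSketch {
  // Hypothetical: hash the index directory path so that two indices
  // never share a lock file name in the shared temporary directory.
  public static String writeLockName(String indexDirPath) throws Exception {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] digest = md5.digest(indexDirPath.getBytes("UTF-8"));
    StringBuffer buf = new StringBuffer("lucene-");
    for (int i = 0; i < digest.length; i++) {
      buf.append(Integer.toHexString(0xff & digest[i]));
    }
    return buf.toString() + "-write.lock";
  }
}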
</blockquote>
</td></tr>
<tr><td><br/></td></tr>
@ -1170,20 +1238,11 @@ limitations under the License.
<tr><td>
<blockquote>
<p>
Prior to Lucene 2.1 there was a file "deletable"
that contained details about files that need to be
deleted. As of 2.1, a writer dynamically computes
the files that are deletable, instead, so no file
is written.
</p>
</blockquote>
</td></tr>

View File: IndexFileDeleter.java (new file)

@ -0,0 +1,219 @@
package org.apache.lucene.index;
import org.apache.lucene.index.IndexFileNames;
import org.apache.lucene.index.IndexFileNameFilter;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.Directory;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Vector;
import java.util.HashMap;
/**
* A utility class (used by both IndexReader and
* IndexWriter) to keep track of files that need to be
* deleted because they are no longer referenced by the
* index.
*/
public class IndexFileDeleter {
private Vector deletable;
private Vector pending;
private Directory directory;
private SegmentInfos segmentInfos;
private PrintStream infoStream;
public IndexFileDeleter(SegmentInfos segmentInfos, Directory directory)
throws IOException {
this.segmentInfos = segmentInfos;
this.directory = directory;
}
void setInfoStream(PrintStream infoStream) {
this.infoStream = infoStream;
}
/** Determine index files that are no longer referenced
* and therefore should be deleted. This is called once
* (by the writer), and then subsequently we add onto
* deletable any files that are no longer needed at the
* point that we create the unused file (eg when merging
* segments), and we only remove from deletable when a
* file is successfully deleted.
*/
public void findDeletableFiles() throws IOException {
// Gather all "current" segments:
HashMap current = new HashMap();
for(int j=0;j<segmentInfos.size();j++) {
SegmentInfo segmentInfo = (SegmentInfo) segmentInfos.elementAt(j);
current.put(segmentInfo.name, segmentInfo);
}
// Then go through all files in the Directory that are
// Lucene index files, and add to deletable if they are
// not referenced by the current segments info:
String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
IndexFileNameFilter filter = IndexFileNameFilter.getFilter();
String[] files = directory.list();
for (int i = 0; i < files.length; i++) {
if (filter.accept(null, files[i]) && !files[i].equals(segmentsInfosFileName) && !files[i].equals(IndexFileNames.SEGMENTS_GEN)) {
String segmentName;
String extension;
// First remove any extension:
int loc = files[i].indexOf('.');
if (loc != -1) {
extension = files[i].substring(1+loc);
segmentName = files[i].substring(0, loc);
} else {
extension = null;
segmentName = files[i];
}
// Then, remove any generation count:
loc = segmentName.indexOf('_', 1);
if (loc != -1) {
segmentName = segmentName.substring(0, loc);
}
// Delete this file if it's not a "current" segment,
// or, it is a single index file but there is now a
// corresponding compound file:
boolean doDelete = false;
if (!current.containsKey(segmentName)) {
// Delete if segment is not referenced:
doDelete = true;
} else {
// OK, segment is referenced, but file may still
// be orphan'd:
SegmentInfo info = (SegmentInfo) current.get(segmentName);
if (filter.isCFSFile(files[i]) && info.getUseCompoundFile()) {
// This file is in fact stored in a CFS file for
// this segment:
doDelete = true;
} else {
if ("del".equals(extension)) {
// This is a _segmentName_N.del file:
if (!files[i].equals(info.getDelFileName())) {
// If this is a separate .del file, but it
// doesn't match the current del filename for
// this segment, then delete it:
doDelete = true;
}
} else if (extension != null && extension.startsWith("s") && extension.matches("s\\d+")) {
int field = Integer.parseInt(extension.substring(1));
// This is a _segmentName_N.sX file:
if (!files[i].equals(info.getNormFileName(field))) {
// This is an orphan'd separate norms file:
doDelete = true;
}
}
}
}
if (doDelete) {
addDeletableFile(files[i]);
if (infoStream != null) {
infoStream.println("IndexFileDeleter: file \"" + files[i] + "\" is unreferenced in index and will be deleted on next commit");
}
}
}
}
}
/*
* Some operating systems (e.g. Windows) don't permit a file to be deleted
* while it is opened for read (e.g. by another process or thread). So we
* assume that when a delete fails it is because the file is open in another
* process, and queue the file for subsequent deletion.
*/
public final void deleteSegments(Vector segments) throws IOException {
deleteFiles(); // try to delete files that we couldn't before
for (int i = 0; i < segments.size(); i++) {
SegmentReader reader = (SegmentReader)segments.elementAt(i);
if (reader.directory() == this.directory)
deleteFiles(reader.files()); // try to delete our files
else
deleteFiles(reader.files(), reader.directory()); // delete other files
}
}
public final void deleteFiles(Vector files, Directory directory)
throws IOException {
for (int i = 0; i < files.size(); i++)
directory.deleteFile((String)files.elementAt(i));
}
public final void deleteFiles(Vector files)
throws IOException {
deleteFiles(); // try to delete files that we couldn't before
for (int i = 0; i < files.size(); i++) {
deleteFile((String) files.elementAt(i));
}
}
public final void deleteFile(String file)
throws IOException {
try {
directory.deleteFile(file); // try to delete each file
} catch (IOException e) { // if delete fails
if (directory.fileExists(file)) {
if (infoStream != null)
infoStream.println("IndexFileDeleter: unable to remove file \"" + file + "\": " + e.toString() + "; Will re-try later.");
addDeletableFile(file); // add to deletable
}
}
}
final void clearPendingFiles() {
pending = null;
}
final void addPendingFile(String fileName) {
if (pending == null) {
pending = new Vector();
}
pending.addElement(fileName);
}
final void commitPendingFiles() {
if (pending != null) {
if (deletable == null) {
deletable = pending;
pending = null;
} else {
deletable.addAll(pending);
pending = null;
}
}
}
public final void addDeletableFile(String fileName) {
if (deletable == null) {
deletable = new Vector();
}
deletable.addElement(fileName);
}
public final void deleteFiles()
throws IOException {
if (deletable != null) {
Vector oldDeletable = deletable;
deletable = null;
deleteFiles(oldDeletable); // try to delete deletable
}
}
}
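// Assumed usage sketch (wiring mirrors what IndexWriter does later in
// this commit; the segmentInfos/directory/infoStream values are
// placeholders):
//
//   IndexFileDeleter deleter = new IndexFileDeleter(segmentInfos, directory);
//   deleter.setInfoStream(infoStream);   // optional diagnostics
//   deleter.findDeletableFiles();        // scan directory against segmentInfos
//   deleter.deleteFiles();               // best-effort delete; failures re-queued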

View File: IndexFileNameFilter.java

@ -19,6 +19,7 @@ package org.apache.lucene.index;
import java.io.File;
import java.io.FilenameFilter;
import java.util.HashSet;
/**
* Filename filter that accepts filenames and extensions only created by Lucene.
@ -28,18 +29,64 @@ import java.io.FilenameFilter;
*/
public class IndexFileNameFilter implements FilenameFilter {
static IndexFileNameFilter singleton = new IndexFileNameFilter();
private HashSet extensions;
public IndexFileNameFilter() {
extensions = new HashSet();
for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
extensions.add(IndexFileNames.INDEX_EXTENSIONS[i]);
}
}
/* (non-Javadoc)
* @see java.io.FilenameFilter#accept(java.io.File, java.lang.String)
*/
public boolean accept(File dir, String name) {
int i = name.lastIndexOf('.');
if (i != -1) {
String extension = name.substring(1+i);
if (extensions.contains(extension)) {
return true;
} else if (extension.startsWith("f") &&
extension.matches("f\\d+")) {
return true;
} else if (extension.startsWith("s") &&
extension.matches("s\\d+")) {
return true;
}
} else {
if (name.equals(IndexFileNames.DELETABLE)) return true;
else if (name.startsWith(IndexFileNames.SEGMENTS)) return true;
}
return false;
}
/**
* Returns true if this is a file that would be contained
* in a CFS file. This function should only be called on
* files that pass the above "accept" (ie, are already
* known to be a Lucene index file).
*/
public boolean isCFSFile(String name) {
int i = name.lastIndexOf('.');
if (i != -1) {
String extension = name.substring(1+i);
if (extensions.contains(extension) &&
!extension.equals("del") &&
!extension.equals("gen") &&
!extension.equals("cfs")) {
return true;
}
if (extension.startsWith("f") &&
extension.matches("f\\d+")) {
return true;
}
}
return false;
}
public static IndexFileNameFilter getFilter() {
return singleton;
}
}
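// Example (assumed names): classifying a few directory entries with
// the filter above:
//
//   IndexFileNameFilter filter = IndexFileNameFilter.getFilter();
//   filter.accept(null, "_1.cfs")      ==> true  (known extension)
//   filter.accept(null, "segments_2")  ==> true  (segments_N, no extension)
//   filter.accept(null, "foo.txt")     ==> false (not a Lucene index file)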

View File: IndexFileNames.java

@ -28,18 +28,24 @@ final class IndexFileNames {
/** Name of the index segment file */
static final String SEGMENTS = "segments";
/** Name of the generation reference file */
static final String SEGMENTS_GEN = "segments.gen";
/** Name of the index deletable file (only used in
* pre-lockless indices) */
static final String DELETABLE = "deletable";
/**
* This array contains all filename extensions used by
* Lucene's index files, with two exceptions, namely the
* extension made up from <code>.f</code> + a number and
* from <code>.s</code> + a number. Also note that
* Lucene's <code>segments_N</code> files do not have any
* filename extension.
*/
static final String INDEX_EXTENSIONS[] = new String[] {
"cfs", "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del",
"tvx", "tvd", "tvf", "tvp", "gen"};
/** File extensions of old-style index files */
static final String COMPOUND_EXTENSIONS[] = new String[] {
@ -51,4 +57,23 @@ final class IndexFileNames {
"tvx", "tvd", "tvf" "tvx", "tvd", "tvf"
}; };
/**
* Computes the full file name from base, extension and
* generation. If the generation is -1, the file name is
* null. If it's 0, the file name is <base><extension>.
* If it's > 0, the file name is <base>_<generation><extension>.
*
* @param base -- main part of the file name
* @param extension -- extension of the filename (including .)
* @param gen -- generation
*/
public static final String fileNameFromGeneration(String base, String extension, long gen) {
if (gen == -1) {
return null;
} else if (gen == 0) {
return base + extension;
} else {
return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
}
}
}
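// Examples (assumed inputs) of the rules documented above; note the
// class is package-private, so these calls only compile within
// org.apache.lucene.index:
//
//   fileNameFromGeneration("segments", "", -1)  ==> null
//   fileNameFromGeneration("segments", "", 0)   ==> "segments"
//   fileNameFromGeneration("_7", ".del", 3)     ==> "_7_3.del"
//   fileNameFromGeneration("segments", "", 36)  ==> "segments_10" (base 36)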

View File: IndexReader.java

@ -113,6 +113,7 @@ public abstract class IndexReader {
private Directory directory;
private boolean directoryOwner;
private boolean closeDirectory;
protected IndexFileDeleter deleter;
private SegmentInfos segmentInfos;
private Lock writeLock;
@ -138,25 +139,41 @@ public abstract class IndexReader {
}
private static IndexReader open(final Directory directory, final boolean closeDirectory) throws IOException {
return (IndexReader) new SegmentInfos.FindSegmentsFile(directory) {
public Object doBody(String segmentFileName) throws IOException {
SegmentInfos infos = new SegmentInfos();
infos.read(directory, segmentFileName);
if (infos.size() == 1) { // index is optimized
return SegmentReader.get(infos, infos.info(0), closeDirectory);
} else {
// To reduce the chance of hitting FileNotFound
// (and having to retry), we open segments in
// reverse because IndexWriter merges & deletes
// the newest segments first.
IndexReader[] readers = new IndexReader[infos.size()];
for (int i = infos.size()-1; i >= 0; i--) {
try {
readers[i] = SegmentReader.get(infos.info(i));
} catch (IOException e) {
// Close all readers we had opened:
for(i++;i<infos.size();i++) {
readers[i].close();
}
throw e;
}
}
return new MultiReader(directory, infos, closeDirectory, readers);
}
}
}.run();
}
/** Returns the directory this index resides in. */
public Directory directory() { return directory; }
@ -175,8 +192,12 @@ public abstract class IndexReader {
* Do not use this to check whether the reader is still up-to-date, use
* {@link #isCurrent()} instead.
*/
public static long lastModified(File fileDirectory) throws IOException {
return ((Long) new SegmentInfos.FindSegmentsFile(fileDirectory) {
public Object doBody(String segmentFileName) {
return new Long(FSDirectory.fileModified(fileDirectory, segmentFileName));
}
}.run()).longValue();
}
/**
@ -184,8 +205,12 @@ public abstract class IndexReader {
* Do not use this to check whether the reader is still up-to-date, use
* {@link #isCurrent()} instead.
*/
public static long lastModified(final Directory directory2) throws IOException {
return ((Long) new SegmentInfos.FindSegmentsFile(directory2) {
public Object doBody(String segmentFileName) throws IOException {
return new Long(directory2.fileModified(segmentFileName));
}
}.run()).longValue();
}
/**
@ -227,21 +252,7 @@ public abstract class IndexReader {
* @throws IOException if segments file cannot be read.
*/
public static long getCurrentVersion(Directory directory) throws IOException {
return SegmentInfos.readCurrentVersion(directory);
}
/**
@ -259,21 +270,7 @@ public abstract class IndexReader {
* @throws IOException
*/
public boolean isCurrent() throws IOException {
return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
}
/**
@ -319,7 +316,7 @@ public abstract class IndexReader {
* @return <code>true</code> if an index exists; <code>false</code> otherwise
*/
public static boolean indexExists(String directory) {
return indexExists(new File(directory));
}
/**
@ -328,8 +325,9 @@ public abstract class IndexReader {
* @param directory the directory to check for an index
* @return <code>true</code> if an index exists; <code>false</code> otherwise
*/
public static boolean indexExists(File directory) {
return SegmentInfos.getCurrentSegmentGeneration(directory.list()) != -1;
}
/**
@ -340,7 +338,7 @@ public abstract class IndexReader {
* @throws IOException if there is a problem with accessing the index
*/
public static boolean indexExists(Directory directory) throws IOException {
return SegmentInfos.getCurrentSegmentGeneration(directory) != -1;
}
/** Returns the number of documents in this index. */
@ -592,17 +590,22 @@ public abstract class IndexReader {
*/
protected final synchronized void commit() throws IOException{
if(hasChanges){
if (deleter == null) {
// In the MultiReader case, we share this deleter
// across all SegmentReaders:
setDeleter(new IndexFileDeleter(segmentInfos, directory));
deleter.deleteFiles();
}
if(directoryOwner){
deleter.clearPendingFiles();
doCommit();
String oldInfoFileName = segmentInfos.getCurrentSegmentFileName();
segmentInfos.write(directory);
// Attempt to delete all files we just obsoleted:
deleter.deleteFile(oldInfoFileName);
deleter.commitPendingFiles();
deleter.deleteFiles();
if (writeLock != null) {
writeLock.release(); // release write lock
writeLock = null;
@ -614,6 +617,13 @@ public abstract class IndexReader {
hasChanges = false;
}
protected void setDeleter(IndexFileDeleter deleter) {
this.deleter = deleter;
}
protected IndexFileDeleter getDeleter() {
return deleter;
}
/** Implements commit. */
protected abstract void doCommit() throws IOException;
@ -658,8 +668,7 @@ public abstract class IndexReader {
*/
public static boolean isLocked(Directory directory) throws IOException {
return
directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked();
}
/**
@ -684,7 +693,6 @@ public abstract class IndexReader {
*/
public static void unlock(Directory directory) throws IOException {
directory.makeLock(IndexWriter.WRITE_LOCK_NAME).release();
}
/**

View File: IndexWriter.java

@ -67,16 +67,7 @@ public class IndexWriter {
private long writeLockTimeout = WRITE_LOCK_TIMEOUT;
/**
* Default value for the commit lock timeout (10,000).
* @see #setDefaultCommitLockTimeout
*/
public static long COMMIT_LOCK_TIMEOUT = 10000;
private long commitLockTimeout = COMMIT_LOCK_TIMEOUT;
public static final String WRITE_LOCK_NAME = "write.lock";
public static final String COMMIT_LOCK_NAME = "commit.lock";
/**
* Default value is 10. Change using {@link #setMergeFactor(int)}.
@ -111,6 +102,7 @@ public class IndexWriter {
private SegmentInfos segmentInfos = new SegmentInfos(); // the segments
private SegmentInfos ramSegmentInfos = new SegmentInfos(); // the segments in ramDirectory
private final Directory ramDirectory = new RAMDirectory(); // for temp segs
private IndexFileDeleter deleter;
private Lock writeLock;
@ -260,19 +252,30 @@ public class IndexWriter {
this.writeLock = writeLock; // save it
try {
if (create) {
// Try to read first. This is to allow create
// against an index that's currently open for
// searching. In this case we write the next
// segments_N file with no segments:
try {
segmentInfos.read(directory);
segmentInfos.clear();
} catch (IOException e) {
// Likely this means it's a fresh directory
}
segmentInfos.write(directory);
} else {
segmentInfos.read(directory);
}
// Create a deleter to keep track of which files can
// be deleted:
deleter = new IndexFileDeleter(segmentInfos, directory);
deleter.setInfoStream(infoStream);
deleter.findDeletableFiles();
deleter.deleteFiles();
} catch (IOException e) {
this.writeLock.release();
this.writeLock = null;
throw e;
@ -380,35 +383,6 @@ public class IndexWriter {
return infoStream;
}
/**
* Sets the maximum time to wait for a commit lock (in milliseconds) for this instance of IndexWriter. @see
* @see #setDefaultCommitLockTimeout to change the default value for all instances of IndexWriter.
*/
public void setCommitLockTimeout(long commitLockTimeout) {
this.commitLockTimeout = commitLockTimeout;
}
/**
* @see #setCommitLockTimeout
*/
public long getCommitLockTimeout() {
return commitLockTimeout;
}
/**
* Sets the default (for any instance of IndexWriter) maximum time to wait for a commit lock (in milliseconds)
*/
public static void setDefaultCommitLockTimeout(long commitLockTimeout) {
IndexWriter.COMMIT_LOCK_TIMEOUT = commitLockTimeout;
}
/**
* @see #setDefaultCommitLockTimeout
*/
public static long getDefaultCommitLockTimeout() {
return IndexWriter.COMMIT_LOCK_TIMEOUT;
}
/**
* Sets the maximum time to wait for a write lock (in milliseconds) for this instance of IndexWriter. @see
* @see #setDefaultWriteLockTimeout to change the default value for all instances of IndexWriter.
@ -517,7 +491,7 @@ public class IndexWriter {
String segmentName = newRAMSegmentName();
dw.addDocument(segmentName, doc);
synchronized (this) {
ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory, false));
maybeFlushRamSegments();
}
}
@ -790,36 +764,26 @@ public class IndexWriter {
int docCount = merger.merge(); // merge 'em
segmentInfos.setSize(0); // pop old infos & add new
SegmentInfo info = new SegmentInfo(mergedName, docCount, directory, false);
segmentInfos.addElement(info);
if(sReader != null)
sReader.close();
String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
segmentInfos.write(directory); // commit changes
deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
deleter.deleteSegments(segmentsToDelete); // delete now-unused segments
if (useCompoundFile) {
Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
info.setUseCompoundFile(true);
segmentInfos.write(directory); // commit again so readers know we've switched this segment to a compound file
deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
deleter.deleteFiles(filesToDelete); // delete now unused files of segment
}
}
@ -937,6 +901,7 @@ public class IndexWriter {
*/
private final int mergeSegments(SegmentInfos sourceSegments, int minSegment, int end)
throws IOException {
final String mergedName = newSegmentName();
if (infoStream != null) infoStream.print("merging segments");
SegmentMerger merger = new SegmentMerger(this, mergedName);
@ -960,7 +925,7 @@ public class IndexWriter {
}
SegmentInfo newSegment = new SegmentInfo(mergedName, mergedDocCount,
directory, false);
if (sourceSegments == ramSegmentInfos) {
sourceSegments.removeAllElements();
segmentInfos.addElement(newSegment);
@ -973,115 +938,26 @@ public class IndexWriter {
// close readers before we attempt to delete now-obsolete segments
merger.closeReaders();
String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
segmentInfos.write(directory); // commit before deleting
deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
deleter.deleteSegments(segmentsToDelete); // delete now-unused segments
if (useCompoundFile) {
Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
newSegment.setUseCompoundFile(true);
segmentInfos.write(directory); // commit again so readers know we've switched this segment to a compound file
deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
deleter.deleteFiles(filesToDelete); // delete now-unused files of segment
}
return mergedDocCount;
}
/*
* Some operating systems (e.g. Windows) don't permit a file to be deleted
* while it is opened for read (e.g. by another process or thread). So we
* assume that when a delete fails it is because the file is open in another
* process, and queue the file for subsequent deletion.
*/
private final void deleteSegments(Vector segments) throws IOException {
Vector deletable = new Vector();
deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
for (int i = 0; i < segments.size(); i++) {
SegmentReader reader = (SegmentReader)segments.elementAt(i);
if (reader.directory() == this.directory)
deleteFiles(reader.files(), deletable); // try to delete our files
else
deleteFiles(reader.files(), reader.directory()); // delete other files
}
writeDeleteableFiles(deletable); // note files we can't delete
}
private final void deleteFiles(Vector files) throws IOException {
Vector deletable = new Vector();
deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
deleteFiles(files, deletable); // try to delete our files
writeDeleteableFiles(deletable); // note files we can't delete
}
private final void deleteFiles(Vector files, Directory directory)
throws IOException {
for (int i = 0; i < files.size(); i++)
directory.deleteFile((String)files.elementAt(i));
}
private final void deleteFiles(Vector files, Vector deletable)
throws IOException {
for (int i = 0; i < files.size(); i++) {
String file = (String)files.elementAt(i);
try {
directory.deleteFile(file); // try to delete each file
} catch (IOException e) { // if delete fails
if (directory.fileExists(file)) {
if (infoStream != null)
infoStream.println(e.toString() + "; Will re-try later.");
deletable.addElement(file); // add to deletable
}
}
}
}
private final Vector readDeleteableFiles() throws IOException {
Vector result = new Vector();
if (!directory.fileExists(IndexFileNames.DELETABLE))
return result;
IndexInput input = directory.openInput(IndexFileNames.DELETABLE);
try {
for (int i = input.readInt(); i > 0; i--) // read file names
result.addElement(input.readString());
} finally {
input.close();
}
return result;
}
private final void writeDeleteableFiles(Vector files) throws IOException {
IndexOutput output = directory.createOutput("deleteable.new");
try {
output.writeInt(files.size());
for (int i = 0; i < files.size(); i++)
output.writeString((String)files.elementAt(i));
} finally {
output.close();
}
directory.renameFile("deleteable.new", IndexFileNames.DELETABLE);
}
private final boolean checkNonDecreasingLevels(int start) {
int lowerBound = -1;
int upperBound = minMergeDocs;

View File: MultiReader.java

@ -218,6 +218,13 @@ public class MultiReader extends IndexReader {
return new MultiTermPositions(subReaders, starts);
}
protected void setDeleter(IndexFileDeleter deleter) {
// Share deleter to our SegmentReaders:
this.deleter = deleter;
for (int i = 0; i < subReaders.length; i++)
subReaders[i].setDeleter(deleter);
}
protected void doCommit() throws IOException {
for (int i = 0; i < subReaders.length; i++)
subReaders[i].commit();

View File: SegmentInfo.java

@ -18,15 +18,302 @@ package org.apache.lucene.index;
*/
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.IndexInput;
import java.io.IOException;
final class SegmentInfo {
public String name; // unique name in dir
public int docCount; // number of docs in seg
public Directory dir; // where segment resides
private boolean preLockless; // true if this is a segments file written before
// lock-less commits (XXX)
private long delGen; // current generation of del file; -1 if there
// are no deletes; 0 if it's a pre-XXX segment
// (and we must check filesystem); 1 or higher if
// there are deletes at generation N
private long[] normGen; // current generations of each field's norm file.
// If this array is null, we must check filesystem
// when preLockLess is true. Else,
// there are no separate norms
private byte isCompoundFile; // -1 if it is not; 1 if it is; 0 if it's
// pre-XXX (ie, must check file system to see
// if <name>.cfs exists)
public SegmentInfo(String name, int docCount, Directory dir) {
this.name = name;
this.docCount = docCount;
this.dir = dir;
delGen = -1;
isCompoundFile = 0;
preLockless = true;
}
public SegmentInfo(String name, int docCount, Directory dir, boolean isCompoundFile) {
this(name, docCount, dir);
if (isCompoundFile) {
this.isCompoundFile = 1;
} else {
this.isCompoundFile = -1;
}
preLockless = false;
}
/**
* Construct a new SegmentInfo instance by reading a
* previously saved SegmentInfo from input.
*
* @param dir directory to load from
* @param format format of the segments info file
* @param input input handle to read segment info from
*/
public SegmentInfo(Directory dir, int format, IndexInput input) throws IOException {
this.dir = dir;
name = input.readString();
docCount = input.readInt();
if (format <= SegmentInfos.FORMAT_LOCKLESS) {
delGen = input.readLong();
int numNormGen = input.readInt();
if (numNormGen == -1) {
normGen = null;
} else {
normGen = new long[numNormGen];
for(int j=0;j<numNormGen;j++) {
normGen[j] = input.readLong();
}
}
isCompoundFile = input.readByte();
preLockless = isCompoundFile == 0;
} else {
delGen = 0;
normGen = null;
isCompoundFile = 0;
preLockless = true;
}
}
void setNumField(int numField) {
if (normGen == null) {
// normGen is null if we loaded a pre-XXX segment
// file, or, if this segments file hasn't had any
// norms set against it yet:
normGen = new long[numField];
if (!preLockless) {
// This is a FORMAT_LOCKLESS segment, which means
// there are no norms:
for(int i=0;i<numField;i++) {
normGen[i] = -1;
}
}
}
}
boolean hasDeletions()
throws IOException {
// Cases:
//
// delGen == -1: this means this segment was written
// by the LOCKLESS code and for certain does not have
// deletions yet
//
// delGen == 0: this means this segment was written by
// pre-LOCKLESS code which means we must check
// directory to see if .del file exists
//
// delGen > 0: this means this segment was written by
// the LOCKLESS code and for certain has
// deletions
//
if (delGen == -1) {
return false;
} else if (delGen > 0) {
return true;
} else {
return dir.fileExists(getDelFileName());
}
}
void advanceDelGen() {
// delGen 0 is reserved for pre-LOCKLESS format
if (delGen == -1) {
delGen = 1;
} else {
delGen++;
}
}
void clearDelGen() {
delGen = -1;
}
String getDelFileName() {
if (delGen == -1) {
// In this case we know there is no deletion filename
// against this segment
return null;
} else {
// If delGen is 0, it's the pre-lockless-commit file format
return IndexFileNames.fileNameFromGeneration(name, ".del", delGen);
}
}
/**
* Returns true if this field for this segment has saved a separate norms file (_<segment>_N.sX).
*
* @param fieldNumber the field index to check
*/
boolean hasSeparateNorms(int fieldNumber)
throws IOException {
if ((normGen == null && preLockless) || (normGen != null && normGen[fieldNumber] == 0)) {
// Must fallback to directory file exists check:
String fileName = name + ".s" + fieldNumber;
return dir.fileExists(fileName);
} else if (normGen == null || normGen[fieldNumber] == -1) {
return false;
} else {
return true;
}
}
/**
* Returns true if any fields in this segment have separate norms.
*/
boolean hasSeparateNorms()
throws IOException {
if (normGen == null) {
if (!preLockless) {
// This means we were created w/ LOCKLESS code and no
// norms are written yet:
return false;
} else {
// This means this segment was saved with pre-LOCKLESS
// code. So we must fallback to the original
// directory list check:
String[] result = dir.list();
String pattern;
pattern = name + ".s";
int patternLength = pattern.length();
for(int i = 0; i < result.length; i++){
if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
return true;
}
return false;
}
} else {
// This means this segment was saved with LOCKLESS
// code so we first check whether any normGen's are >
// 0 (meaning they definitely have separate norms):
for(int i=0;i<normGen.length;i++) {
if (normGen[i] > 0) {
return true;
}
}
// Next we look for any == 0. These cases were
// pre-LOCKLESS and must be checked in directory:
for(int i=0;i<normGen.length;i++) {
if (normGen[i] == 0) {
if (dir.fileExists(getNormFileName(i))) {
return true;
}
}
}
}
return false;
}
/**
* Increment the generation count for the norms file for
* this field.
*
* @param fieldIndex field whose norm file will be rewritten
*/
void advanceNormGen(int fieldIndex) {
if (normGen[fieldIndex] == -1) {
normGen[fieldIndex] = 1;
} else {
normGen[fieldIndex]++;
}
}
/**
* Get the file name for the norms file for this field.
*
* @param number field index
*/
String getNormFileName(int number) throws IOException {
String prefix;
long gen;
if (normGen == null) {
gen = 0;
} else {
gen = normGen[number];
}
if (hasSeparateNorms(number)) {
prefix = ".s";
return IndexFileNames.fileNameFromGeneration(name, prefix + number, gen);
} else {
prefix = ".f";
return IndexFileNames.fileNameFromGeneration(name, prefix + number, 0);
}
}
/**
* Mark whether this segment is stored as a compound file.
*
* @param isCompoundFile true if this is a compound file;
* else, false
*/
void setUseCompoundFile(boolean isCompoundFile) {
if (isCompoundFile) {
this.isCompoundFile = 1;
} else {
this.isCompoundFile = -1;
}
}
/**
* Returns true if this segment is stored as a compound
* file; else, false.
*
* @param directory directory to check. This parameter is
* only used when the segment was written before version
* XXX (at which point compound file or not became stored
* in the segments info file).
*/
boolean getUseCompoundFile() throws IOException {
if (isCompoundFile == -1) {
return false;
} else if (isCompoundFile == 1) {
return true;
} else {
return dir.fileExists(name + ".cfs");
}
}
/**
* Save this segment's info.
*/
void write(IndexOutput output)
throws IOException {
output.writeString(name);
output.writeInt(docCount);
output.writeLong(delGen);
if (normGen == null) {
output.writeInt(-1);
} else {
output.writeInt(normGen.length);
for(int j=0;j<normGen.length;j++) {
output.writeLong(normGen[j]);
}
}
output.writeByte(isCompoundFile);
}
}

View File: SegmentInfos.java

@ -19,36 +19,151 @@ package org.apache.lucene.index;
import java.util.Vector;
import java.io.IOException;
import java.io.PrintStream;
import java.io.File;
import java.io.FileNotFoundException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.Constants;
public final class SegmentInfos extends Vector {
/** The file format version, a negative number. */
/* Works since counter, the old 1st entry, is always >= 0 */
public static final int FORMAT = -1;
/** This is the current file format written. It differs
* slightly from the previous format in that file names
* are never re-used (write once). Instead, each file is
* written to the next generation. For example,
* segments_1, segments_2, etc. This allows us to not use
* a commit lock. See <a
* href="http://lucene.apache.org/java/docs/fileformats.html">file
* formats</a> for details.
*/
public static final int FORMAT_LOCKLESS = -2;
public int counter = 0; // used to name new segments
/**
* counts how often the index has been changed by adding or deleting docs.
* starting with the current time in milliseconds forces to create unique version numbers.
*/
private long version = System.currentTimeMillis();
private long generation = 0; // generation of the "segments_N" file we read
/**
* If non-null, information about loading segments_N files
* will be printed here. @see #setInfoStream.
*/
private static PrintStream infoStream;
public final SegmentInfo info(int i) {
return (SegmentInfo) elementAt(i);
}
/**
* Get the generation (N) of the current segments_N file
* from a list of files.
*
* @param files -- array of file names to check
*/
public static long getCurrentSegmentGeneration(String[] files) {
if (files == null) {
return -1;
}
long max = -1;
int prefixLen = IndexFileNames.SEGMENTS.length()+1;
for (int i = 0; i < files.length; i++) {
String file = files[i];
if (file.startsWith(IndexFileNames.SEGMENTS) && !file.equals(IndexFileNames.SEGMENTS_GEN)) {
if (file.equals(IndexFileNames.SEGMENTS)) {
// Pre lock-less commits:
if (max == -1) {
max = 0;
}
} else {
long v = Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
if (v > max) {
max = v;
}
}
}
}
return max;
}
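// Example (assumed listing) of the rule above: the largest generation
// wins, "segments" alone maps to generation 0, and segments.gen is
// ignored by this method:
//
//   getCurrentSegmentGeneration(new String[] {
//     "segments", "segments_2", "segments.gen", "_1.cfs" })  ==> 2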
/**
* Get the generation (N) of the current segments_N file
* in the directory.
*
* @param directory -- directory to search for the latest segments_N file
*/
public static long getCurrentSegmentGeneration(Directory directory) throws IOException {
String[] files = directory.list();
if (files == null)
throw new IOException("Cannot read directory " + directory);
return getCurrentSegmentGeneration(files);
}
/**
* Get the filename of the current segments_N file
* from a list of files.
*
* @param files -- array of file names to check
*/
public static String getCurrentSegmentFileName(String[] files) throws IOException {
return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
getCurrentSegmentGeneration(files));
}
/**
* Get the filename of the current segments_N file
* in the directory.
*
* @param directory -- directory to search for the latest segments_N file
*/
public static String getCurrentSegmentFileName(Directory directory) throws IOException {
return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
getCurrentSegmentGeneration(directory));
}
/**
* Get the segment_N filename in use by this segment infos.
*/
public String getCurrentSegmentFileName() {
return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
generation);
}
/**
* Read a particular segmentFileName. Note that this may
* throw an IOException if a commit is in process.
*
* @param directory -- directory containing the segments file
* @param segmentFileName -- segment file to load
*/
public final void read(Directory directory, String segmentFileName) throws IOException {
boolean success = false;
IndexInput input = directory.openInput(segmentFileName);
if (segmentFileName.equals(IndexFileNames.SEGMENTS)) {
generation = 0;
} else {
generation = Long.parseLong(segmentFileName.substring(1+IndexFileNames.SEGMENTS.length()),
Character.MAX_RADIX);
}
try {
int format = input.readInt();
if(format < 0){ // file contains explicit format info
// check that it is a format we can understand
if (format < FORMAT_LOCKLESS)
throw new IOException("Unknown format version: " + format);
version = input.readLong(); // read version
counter = input.readInt(); // read counter
@ -58,9 +173,7 @@ final class SegmentInfos extends Vector {
}
for (int i = input.readInt(); i > 0; i--) { // read segmentInfos
addElement(new SegmentInfo(directory, format, input));
}
if(format >= 0){ // in old format the version number may be at the end of the file
@ -69,31 +182,71 @@ final class SegmentInfos extends Vector {
else
version = input.readLong(); // read version
}
success = true;
}
finally {
input.close();
if (!success) {
// Clear any segment infos we had loaded so we
// have a clean slate on retry:
clear();
} }
}
}
/**
* This version of read uses the retry logic (for lock-less
* commits) to find the right segments file to load.
*/
public final void read(Directory directory) throws IOException {
generation = -1;
new FindSegmentsFile(directory) {
public Object doBody(String segmentFileName) throws IOException {
read(directory, segmentFileName);
return null;
}
}.run();
}
public final void write(Directory directory) throws IOException {
// Always advance the generation on write:
if (generation == -1) {
generation = 1;
} else {
generation++;
}
String segmentFileName = getCurrentSegmentFileName();
IndexOutput output = directory.createOutput(segmentFileName);
try { try {
output.writeInt(FORMAT); // write FORMAT output.writeInt(FORMAT_LOCKLESS); // write FORMAT
output.writeLong(++version); // every write changes the index output.writeLong(++version); // every write changes
// the index
output.writeInt(counter); // write counter output.writeInt(counter); // write counter
output.writeInt(size()); // write infos output.writeInt(size()); // write infos
for (int i = 0; i < size(); i++) { for (int i = 0; i < size(); i++) {
SegmentInfo si = info(i); SegmentInfo si = info(i);
output.writeString(si.name); si.write(output);
output.writeInt(si.docCount);
} }
} }
finally { finally {
output.close(); output.close();
} }
// install new segment info try {
directory.renameFile("segments.new", IndexFileNames.SEGMENTS); output = directory.createOutput(IndexFileNames.SEGMENTS_GEN);
output.writeInt(FORMAT_LOCKLESS);
output.writeLong(generation);
output.writeLong(generation);
output.close();
} catch (IOException e) {
// It's OK if we fail to write this file since it's
// used only as one of the retry fallbacks.
}
} }
/** /**
@ -109,13 +262,17 @@ final class SegmentInfos extends Vector {
public static long readCurrentVersion(Directory directory) public static long readCurrentVersion(Directory directory)
throws IOException { throws IOException {
IndexInput input = directory.openInput(IndexFileNames.SEGMENTS); return ((Long) new FindSegmentsFile(directory) {
public Object doBody(String segmentFileName) throws IOException {
IndexInput input = directory.openInput(segmentFileName);
int format = 0; int format = 0;
long version = 0; long version = 0;
try { try {
format = input.readInt(); format = input.readInt();
if(format < 0){ if(format < 0){
if (format < FORMAT) if (format < FORMAT_LOCKLESS)
throw new IOException("Unknown format version: " + format); throw new IOException("Unknown format version: " + format);
version = input.readLong(); // read version version = input.readLong(); // read version
} }
@ -125,13 +282,301 @@ final class SegmentInfos extends Vector {
} }
if(format < 0) if(format < 0)
return version; return new Long(version);
// We cannot be sure about the format of the file. // We cannot be sure about the format of the file.
// Therefore we have to read the whole file and cannot simply seek to the version entry. // Therefore we have to read the whole file and cannot simply seek to the version entry.
SegmentInfos sis = new SegmentInfos(); SegmentInfos sis = new SegmentInfos();
sis.read(directory); sis.read(directory, segmentFileName);
return sis.getVersion(); return new Long(sis.getVersion());
} }
}.run()).longValue();
}
/** If non-null, information about retries when loading
* the segments file will be printed to this.
*/
public static void setInfoStream(PrintStream infoStream) {
SegmentInfos.infoStream = infoStream;
}
/* Advanced configuration of retry logic in loading
segments_N file */
private static int defaultGenFileRetryCount = 10;
private static int defaultGenFileRetryPauseMsec = 50;
private static int defaultGenLookaheadCount = 10;
/**
* Advanced: set how many times to try loading the
* segments.gen file contents to determine current segment
* generation. This file is only referenced when the
* primary method (listing the directory) fails.
*/
public static void setDefaultGenFileRetryCount(int count) {
defaultGenFileRetryCount = count;
}
/**
* @see #setDefaultGenFileRetryCount
*/
public static int getDefaultGenFileRetryCount() {
return defaultGenFileRetryCount;
}
/**
* Advanced: set how many milliseconds to pause in between
* attempts to load the segments.gen file.
*/
public static void setDefaultGenFileRetryPauseMsec(int msec) {
defaultGenFileRetryPauseMsec = msec;
}
/**
* @see #setDefaultGenFileRetryPauseMsec
*/
public static int getDefaultGenFileRetryPauseMsec() {
return defaultGenFileRetryPauseMsec;
}
/**
* Advanced: set how many times to try incrementing the
* gen when loading the segments file. This only runs if
* the primary (listing directory) and secondary (opening
* segments.gen file) methods fail to find the segments
* file.
*/
public static void setDefaultGenLookaheadCount(int count) {
defaultGenLookaheadCount = count;
}
/**
* @see #setDefaultGenLookaheadCount
*/
public static int getDefaultGenLookaheadCount() {
return defaultGenLookaheadCount;
}
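As a usage sketch (the values here are illustrative, not recommendations): an application whose index lives on an NFS mount with aggressive caching might raise these defaults before opening readers:
    // Hypothetical tuning for a filesystem with stale caching (e.g. NFS);
    // the defaults (10 retries, 50 msec pause, lookahead of 10) suit
    // local disks.
    SegmentInfos.setDefaultGenFileRetryCount(20);
    SegmentInfos.setDefaultGenFileRetryPauseMsec(100);
    SegmentInfos.setDefaultGenLookaheadCount(20);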
/**
* @see #setInfoStream
*/
public static PrintStream getInfoStream() {
return infoStream;
}
private static void message(String message) {
if (infoStream != null) {
infoStream.println(Thread.currentThread().getName() + ": " + message);
}
}
/**
* Utility class for executing code that needs to do
* something with the current segments file. This is
* necessary with lock-less commits because from the time
* you locate the current segments file name, until you
* actually open it, read its contents, or check modified
* time, etc., it could have been deleted due to a writer
* commit finishing.
*/
public abstract static class FindSegmentsFile {
File fileDirectory;
Directory directory;
public FindSegmentsFile(File directory) {
this.fileDirectory = directory;
}
public FindSegmentsFile(Directory directory) {
this.directory = directory;
}
public Object run() throws IOException {
String segmentFileName = null;
long lastGen = -1;
long gen = 0;
int genLookaheadCount = 0;
IOException exc = null;
boolean retry = false;
int method = 0;
// Loop until we succeed in calling doBody() without
// hitting an IOException. An IOException most likely
// means a commit was in process and has finished, in
// the time it took us to load the now-old infos files
// (and segments files). It's also possible it's a
// true error (corrupt index). To distinguish these,
// on each retry we must see "forward progress" on
// which generation we are trying to load. If we
// don't, then the original error is real and we throw
// it.
// We have three methods for determining the current
// generation. We try each in sequence.
while(true) {
// Method 1: list the directory and use the highest
// segments_N file. This method works well as long
// as there is no stale caching on the directory
// contents:
String[] files = null;
if (0 == method) {
if (directory != null) {
files = directory.list();
} else {
files = fileDirectory.list();
}
gen = getCurrentSegmentGeneration(files);
if (gen == -1) {
String s = "";
for(int i=0;i<files.length;i++) {
s += " " + files[i];
}
throw new FileNotFoundException("no segments* file found: files:" + s);
}
}
// Method 2 (fallback if Method 1 isn't reliable):
// if the directory listing seems to be stale, then
// try loading the "segments.gen" file.
if (1 == method || (0 == method && lastGen == gen && retry)) {
method = 1;
for(int i=0;i<defaultGenFileRetryCount;i++) {
IndexInput genInput = null;
try {
genInput = directory.openInput(IndexFileNames.SEGMENTS_GEN);
} catch (IOException e) {
message("segments.gen open: IOException " + e);
}
if (genInput != null) {
try {
int version = genInput.readInt();
if (version == FORMAT_LOCKLESS) {
long gen0 = genInput.readLong();
long gen1 = genInput.readLong();
message("fallback check: " + gen0 + "; " + gen1);
if (gen0 == gen1) {
// The file is consistent.
if (gen0 > gen) {
message("fallback to '" + IndexFileNames.SEGMENTS_GEN + "' check: now try generation " + gen0 + " > " + gen);
gen = gen0;
}
break;
}
}
} catch (IOException err2) {
// will retry
} finally {
genInput.close();
}
}
try {
Thread.sleep(defaultGenFileRetryPauseMsec);
} catch (InterruptedException e) {
// will retry
}
}
}
// Method 3 (fallback if Methods 1 & 2 are not
// reliable): since both directory cache and file
// contents cache seem to be stale, just advance the
// generation.
if (2 == method || (1 == method && lastGen == gen && retry)) {
method = 2;
if (genLookaheadCount < defaultGenLookaheadCount) {
gen++;
genLookaheadCount++;
message("look ahead incremenent gen to " + gen);
}
}
if (lastGen == gen) {
// This means we're about to try the same
// segments_N we last tried. This is allowed,
// exactly once, because the writer could have been in
// the process of writing segments_N last time.
if (retry) {
// OK, we've tried the same segments_N file
// twice in a row, so this must be a real
// error. We throw the original exception we
// got.
throw exc;
} else {
retry = true;
}
} else {
// Segment file has advanced since our last loop, so
// reset retry:
retry = false;
}
lastGen = gen;
segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
gen);
try {
Object v = doBody(segmentFileName);
if (exc != null) {
message("success on " + segmentFileName);
}
return v;
} catch (IOException err) {
// Save the original root cause:
if (exc == null) {
exc = err;
}
message("primary Exception on '" + segmentFileName + "': " + err + "'; will retry: retry=" + retry + "; gen = " + gen);
if (!retry && gen > 1) {
// This is our first time trying this segments
// file (because retry is false), and, there is
// possibly a segments_(N-1) (because gen > 1).
// So, check if the segments_(N-1) exists and
// try it if so:
String prevSegmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
gen-1);
if (directory.fileExists(prevSegmentFileName)) {
message("fallback to prior segment file '" + prevSegmentFileName + "'");
try {
Object v = doBody(prevSegmentFileName);
if (exc != null) {
message("success on fallback " + prevSegmentFileName);
}
return v;
} catch (IOException err2) {
message("secondary Exception on '" + prevSegmentFileName + "': " + err2 + "'; will retry");
}
}
}
}
}
}
/**
* Subclass must implement this. The assumption is an
* IOException will be thrown if something goes wrong
* during the processing that could have been caused by
* a writer committing.
*/
protected abstract Object doBody(String segmentFileName) throws IOException;
}
} }
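A usage sketch for FindSegmentsFile (a hypothetical caller built only on the API above, inside a method that throws IOException): subclass it, put the work in doBody, and let run() retry across concurrent commits:
    // Probe whichever segments_N is current, retrying if a commit
    // slips in between listing the directory and opening the file:
    final Directory dir = FSDirectory.getDirectory("/path/to/index", false);
    String current = (String) new SegmentInfos.FindSegmentsFile(dir) {
      protected Object doBody(String segmentFileName) throws IOException {
        dir.openInput(segmentFileName).close(); // throws if file is gone
        return segmentFileName;
      }
    }.run();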

View File

@ -33,6 +33,7 @@ import java.util.*;
*/ */
class SegmentReader extends IndexReader { class SegmentReader extends IndexReader {
private String segment; private String segment;
private SegmentInfo si;
FieldInfos fieldInfos; FieldInfos fieldInfos;
private FieldsReader fieldsReader; private FieldsReader fieldsReader;
@ -64,22 +65,24 @@ class SegmentReader extends IndexReader {
private boolean dirty; private boolean dirty;
private int number; private int number;
private void reWrite() throws IOException { private void reWrite(SegmentInfo si) throws IOException {
// NOTE: norms are re-written in regular directory, not cfs // NOTE: norms are re-written in regular directory, not cfs
IndexOutput out = directory().createOutput(segment + ".tmp");
String oldFileName = si.getNormFileName(this.number);
if (oldFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldFileName);
}
si.advanceNormGen(this.number);
IndexOutput out = directory().createOutput(si.getNormFileName(this.number));
try { try {
out.writeBytes(bytes, maxDoc()); out.writeBytes(bytes, maxDoc());
} finally { } finally {
out.close(); out.close();
} }
String fileName;
if(cfsReader == null)
fileName = segment + ".f" + number;
else{
// use a different file name if we have compound format
fileName = segment + ".s" + number;
}
directory().renameFile(segment + ".tmp", fileName);
this.dirty = false; this.dirty = false;
} }
} }
@ -133,10 +136,14 @@ class SegmentReader extends IndexReader {
private void initialize(SegmentInfo si) throws IOException { private void initialize(SegmentInfo si) throws IOException {
segment = si.name; segment = si.name;
this.si = si;
boolean success = false;
try {
// Use compound file directory for some files, if it exists // Use compound file directory for some files, if it exists
Directory cfsDir = directory(); Directory cfsDir = directory();
if (directory().fileExists(segment + ".cfs")) { if (si.getUseCompoundFile()) {
cfsReader = new CompoundFileReader(directory(), segment + ".cfs"); cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
cfsDir = cfsReader; cfsDir = cfsReader;
} }
@ -148,8 +155,9 @@ class SegmentReader extends IndexReader {
tis = new TermInfosReader(cfsDir, segment, fieldInfos); tis = new TermInfosReader(cfsDir, segment, fieldInfos);
// NOTE: the bitvector is stored using the regular directory, not cfs // NOTE: the bitvector is stored using the regular directory, not cfs
if (hasDeletions(si)) if (hasDeletions(si)) {
deletedDocs = new BitVector(directory(), segment + ".del"); deletedDocs = new BitVector(directory(), si.getDelFileName());
}
// make sure that all index files have been read or are kept open // make sure that all index files have been read or are kept open
// so that if an index update removes them we'll still have them // so that if an index update removes them we'll still have them
@ -160,6 +168,18 @@ class SegmentReader extends IndexReader {
if (fieldInfos.hasVectors()) { // open term vector files only as needed if (fieldInfos.hasVectors()) { // open term vector files only as needed
termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment, fieldInfos); termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment, fieldInfos);
} }
success = true;
} finally {
// With lock-less commits, it's entirely possible (and
// fine) to hit a FileNotFound exception above. In
// this case, we want to explicitly close any subset
// of things that were opened so that we don't have to
// wait for a GC to do so.
if (!success) {
doClose();
}
}
} }
protected void finalize() { protected void finalize() {
@ -170,18 +190,38 @@ class SegmentReader extends IndexReader {
protected void doCommit() throws IOException { protected void doCommit() throws IOException {
if (deletedDocsDirty) { // re-write deleted if (deletedDocsDirty) { // re-write deleted
deletedDocs.write(directory(), segment + ".tmp"); String oldDelFileName = si.getDelFileName();
directory().renameFile(segment + ".tmp", segment + ".del"); if (oldDelFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldDelFileName);
} }
if(undeleteAll && directory().fileExists(segment + ".del")){
directory().deleteFile(segment + ".del"); si.advanceDelGen();
// We can write directly to the actual name (vs to a
// .tmp & renaming it) because the file is not live
// until segments file is written:
deletedDocs.write(directory(), si.getDelFileName());
}
if (undeleteAll && si.hasDeletions()) {
String oldDelFileName = si.getDelFileName();
if (oldDelFileName != null) {
// Mark this file for deletion. Note that we don't
// actually try to delete it until the new segments file is
// successfully written:
deleter.addPendingFile(oldDelFileName);
}
si.clearDelGen();
} }
if (normsDirty) { // re-write norms if (normsDirty) { // re-write norms
si.setNumField(fieldInfos.size());
Enumeration values = norms.elements(); Enumeration values = norms.elements();
while (values.hasMoreElements()) { while (values.hasMoreElements()) {
Norm norm = (Norm) values.nextElement(); Norm norm = (Norm) values.nextElement();
if (norm.dirty) { if (norm.dirty) {
norm.reWrite(); norm.reWrite(si);
} }
} }
} }
@ -191,8 +231,12 @@ class SegmentReader extends IndexReader {
} }
protected void doClose() throws IOException { protected void doClose() throws IOException {
if (fieldsReader != null) {
fieldsReader.close(); fieldsReader.close();
}
if (tis != null) {
tis.close(); tis.close();
}
if (freqStream != null) if (freqStream != null)
freqStream.close(); freqStream.close();
@ -209,27 +253,19 @@ class SegmentReader extends IndexReader {
} }
static boolean hasDeletions(SegmentInfo si) throws IOException { static boolean hasDeletions(SegmentInfo si) throws IOException {
return si.dir.fileExists(si.name + ".del"); return si.hasDeletions();
} }
public boolean hasDeletions() { public boolean hasDeletions() {
return deletedDocs != null; return deletedDocs != null;
} }
static boolean usesCompoundFile(SegmentInfo si) throws IOException { static boolean usesCompoundFile(SegmentInfo si) throws IOException {
return si.dir.fileExists(si.name + ".cfs"); return si.getUseCompoundFile();
} }
static boolean hasSeparateNorms(SegmentInfo si) throws IOException { static boolean hasSeparateNorms(SegmentInfo si) throws IOException {
String[] result = si.dir.list(); return si.hasSeparateNorms();
String pattern = si.name + ".s";
int patternLength = pattern.length();
for(int i = 0; i < result.length; i++){
if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
return true;
}
return false;
} }
protected void doDelete(int docNum) { protected void doDelete(int docNum) {
@ -249,24 +285,28 @@ class SegmentReader extends IndexReader {
Vector files() throws IOException { Vector files() throws IOException {
Vector files = new Vector(16); Vector files = new Vector(16);
if (si.getUseCompoundFile()) {
String name = segment + ".cfs";
if (directory().fileExists(name)) {
files.addElement(name);
}
} else {
for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) { for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
String name = segment + "." + IndexFileNames.INDEX_EXTENSIONS[i]; String name = segment + "." + IndexFileNames.INDEX_EXTENSIONS[i];
if (directory().fileExists(name)) if (directory().fileExists(name))
files.addElement(name); files.addElement(name);
} }
}
if (si.hasDeletions()) {
files.addElement(si.getDelFileName());
}
for (int i = 0; i < fieldInfos.size(); i++) { for (int i = 0; i < fieldInfos.size(); i++) {
FieldInfo fi = fieldInfos.fieldInfo(i); String name = si.getNormFileName(i);
if (fi.isIndexed && !fi.omitNorms){ if (name != null && directory().fileExists(name))
String name;
if(cfsReader == null)
name = segment + ".f" + i;
else
name = segment + ".s" + i;
if (directory().fileExists(name))
files.addElement(name); files.addElement(name);
} }
}
return files; return files;
} }
@ -380,7 +420,6 @@ class SegmentReader extends IndexReader {
protected synchronized byte[] getNorms(String field) throws IOException { protected synchronized byte[] getNorms(String field) throws IOException {
Norm norm = (Norm) norms.get(field); Norm norm = (Norm) norms.get(field);
if (norm == null) return null; // not indexed, or norms not stored if (norm == null) return null; // not indexed, or norms not stored
if (norm.bytes == null) { // value not yet read if (norm.bytes == null) { // value not yet read
byte[] bytes = new byte[maxDoc()]; byte[] bytes = new byte[maxDoc()];
norms(field, bytes, 0); norms(field, bytes, 0);
@ -436,11 +475,9 @@ class SegmentReader extends IndexReader {
for (int i = 0; i < fieldInfos.size(); i++) { for (int i = 0; i < fieldInfos.size(); i++) {
FieldInfo fi = fieldInfos.fieldInfo(i); FieldInfo fi = fieldInfos.fieldInfo(i);
if (fi.isIndexed && !fi.omitNorms) { if (fi.isIndexed && !fi.omitNorms) {
// look first if there are separate norms in compound format
String fileName = segment + ".s" + fi.number;
Directory d = directory(); Directory d = directory();
if(!d.fileExists(fileName)){ String fileName = si.getNormFileName(fi.number);
fileName = segment + ".f" + fi.number; if (!si.hasSeparateNorms(fi.number)) {
d = cfsDir; d = cfsDir;
} }
norms.put(fi.name, new Norm(d.openInput(fileName), fi.number)); norms.put(fi.name, new Norm(d.openInput(fileName), fi.number));

View File

@ -128,7 +128,7 @@ public class FSDirectory extends Directory {
* @return the FSDirectory for the named file. */ * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(String path, boolean create) public static FSDirectory getDirectory(String path, boolean create)
throws IOException { throws IOException {
return getDirectory(path, create, null); return getDirectory(new File(path), create, null, true);
} }
/** Returns the directory instance for the named location, using the /** Returns the directory instance for the named location, using the
@ -143,10 +143,16 @@ public class FSDirectory extends Directory {
* @param lockFactory instance of {@link LockFactory} providing the * @param lockFactory instance of {@link LockFactory} providing the
* locking implementation. * locking implementation.
* @return the FSDirectory for the named file. */ * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(String path, boolean create,
LockFactory lockFactory, boolean doRemoveOldFiles)
throws IOException {
return getDirectory(new File(path), create, lockFactory, doRemoveOldFiles);
}
public static FSDirectory getDirectory(String path, boolean create, public static FSDirectory getDirectory(String path, boolean create,
LockFactory lockFactory) LockFactory lockFactory)
throws IOException { throws IOException {
return getDirectory(new File(path), create, lockFactory); return getDirectory(new File(path), create, lockFactory, true);
} }
/** Returns the directory instance for the named location. /** Returns the directory instance for the named location.
@ -158,9 +164,9 @@ public class FSDirectory extends Directory {
* @param file the path to the directory. * @param file the path to the directory.
* @param create if true, create, or erase any existing contents. * @param create if true, create, or erase any existing contents.
* @return the FSDirectory for the named file. */ * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(File file, boolean create) public static FSDirectory getDirectory(File file, boolean create, boolean doRemoveOldFiles)
throws IOException { throws IOException {
return getDirectory(file, create, null); return getDirectory(file, create, null, doRemoveOldFiles);
} }
/** Returns the directory instance for the named location, using the /** Returns the directory instance for the named location, using the
@ -176,7 +182,7 @@ public class FSDirectory extends Directory {
* locking implementation. * locking implementation.
* @return the FSDirectory for the named file. */ * @return the FSDirectory for the named file. */
public static FSDirectory getDirectory(File file, boolean create, public static FSDirectory getDirectory(File file, boolean create,
LockFactory lockFactory) LockFactory lockFactory, boolean doRemoveOldFiles)
throws IOException { throws IOException {
file = new File(file.getCanonicalPath()); file = new File(file.getCanonicalPath());
FSDirectory dir; FSDirectory dir;
@ -188,7 +194,7 @@ public class FSDirectory extends Directory {
} catch (Exception e) { } catch (Exception e) {
throw new RuntimeException("cannot load FSDirectory class: " + e.toString(), e); throw new RuntimeException("cannot load FSDirectory class: " + e.toString(), e);
} }
dir.init(file, create, lockFactory); dir.init(file, create, lockFactory, doRemoveOldFiles);
DIRECTORIES.put(file, dir); DIRECTORIES.put(file, dir);
} else { } else {
@ -199,7 +205,7 @@ public class FSDirectory extends Directory {
} }
if (create) { if (create) {
dir.create(); dir.create(doRemoveOldFiles);
} }
} }
} }
@ -209,23 +215,35 @@ public class FSDirectory extends Directory {
return dir; return dir;
} }
public static FSDirectory getDirectory(File file, boolean create,
LockFactory lockFactory)
throws IOException
{
return getDirectory(file, create, lockFactory, true);
}
public static FSDirectory getDirectory(File file, boolean create)
throws IOException {
return getDirectory(file, create, true);
}
private File directory = null; private File directory = null;
private int refCount; private int refCount;
protected FSDirectory() {}; // permit subclassing protected FSDirectory() {}; // permit subclassing
private void init(File path, boolean create) throws IOException { private void init(File path, boolean create, boolean doRemoveOldFiles) throws IOException {
directory = path; directory = path;
if (create) { if (create) {
create(); create(doRemoveOldFiles);
} }
if (!directory.isDirectory()) if (!directory.isDirectory())
throw new IOException(path + " not a directory"); throw new IOException(path + " not a directory");
} }
private void init(File path, boolean create, LockFactory lockFactory) throws IOException { private void init(File path, boolean create, LockFactory lockFactory, boolean doRemoveOldFiles) throws IOException {
// Set up lockFactory with cascaded defaults: if an instance was passed in, // Set up lockFactory with cascaded defaults: if an instance was passed in,
// use that; else if locks are disabled, use NoLockFactory; else if the // use that; else if locks are disabled, use NoLockFactory; else if the
@ -280,10 +298,10 @@ public class FSDirectory extends Directory {
setLockFactory(lockFactory); setLockFactory(lockFactory);
init(path, create); init(path, create, doRemoveOldFiles);
} }
private synchronized void create() throws IOException { private synchronized void create(boolean doRemoveOldFiles) throws IOException {
if (!directory.exists()) if (!directory.exists())
if (!directory.mkdirs()) if (!directory.mkdirs())
throw new IOException("Cannot create directory: " + directory); throw new IOException("Cannot create directory: " + directory);
@ -291,7 +309,8 @@ public class FSDirectory extends Directory {
if (!directory.isDirectory()) if (!directory.isDirectory())
throw new IOException(directory + " not a directory"); throw new IOException(directory + " not a directory");
String[] files = directory.list(new IndexFileNameFilter()); // clear old files if (doRemoveOldFiles) {
String[] files = directory.list(IndexFileNameFilter.getFilter()); // clear old files
if (files == null) if (files == null)
throw new IOException("Cannot read directory " + directory.getAbsolutePath()); throw new IOException("Cannot read directory " + directory.getAbsolutePath());
for (int i = 0; i < files.length; i++) { for (int i = 0; i < files.length; i++) {
@ -299,13 +318,14 @@ public class FSDirectory extends Directory {
if (!file.delete()) if (!file.delete())
throw new IOException("Cannot delete " + file); throw new IOException("Cannot delete " + file);
} }
}
lockFactory.clearAllLocks(); lockFactory.clearAllLocks();
} }
/** Returns an array of strings, one for each Lucene index file in the directory. */ /** Returns an array of strings, one for each Lucene index file in the directory. */
public String[] list() { public String[] list() {
return directory.list(new IndexFileNameFilter()); return directory.list(IndexFileNameFilter.getFilter());
} }
/** Returns true iff a file with the given name exists. */ /** Returns true iff a file with the given name exists. */

View File

@ -18,6 +18,7 @@ package org.apache.lucene.store;
*/ */
import java.io.IOException; import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File; import java.io.File;
import java.io.Serializable; import java.io.Serializable;
import java.util.Hashtable; import java.util.Hashtable;
@ -105,7 +106,7 @@ public final class RAMDirectory extends Directory implements Serializable {
} }
/** Returns an array of strings, one for each file in the directory. */ /** Returns an array of strings, one for each file in the directory. */
public final String[] list() { public synchronized final String[] list() {
String[] result = new String[files.size()]; String[] result = new String[files.size()];
int i = 0; int i = 0;
Enumeration names = files.keys(); Enumeration names = files.keys();
@ -175,8 +176,11 @@ public final class RAMDirectory extends Directory implements Serializable {
} }
/** Returns a stream reading an existing file. */ /** Returns a stream reading an existing file. */
public final IndexInput openInput(String name) { public final IndexInput openInput(String name) throws IOException {
RAMFile file = (RAMFile)files.get(name); RAMFile file = (RAMFile)files.get(name);
if (file == null) {
throw new FileNotFoundException(name);
}
return new RAMInputStream(file); return new RAMInputStream(file);
} }

View File

@ -32,6 +32,7 @@ import org.apache.lucene.document.Field;
import java.util.Collection; import java.util.Collection;
import java.io.IOException; import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File; import java.io.File;
public class TestIndexReader extends TestCase public class TestIndexReader extends TestCase
@ -222,6 +223,11 @@ public class TestIndexReader extends TestCase
assertEquals("deleted count", 100, deleted); assertEquals("deleted count", 100, deleted);
assertEquals("deleted docFreq", 100, reader.docFreq(searchTerm)); assertEquals("deleted docFreq", 100, reader.docFreq(searchTerm));
assertTermDocsCount("deleted termDocs", reader, searchTerm, 0); assertTermDocsCount("deleted termDocs", reader, searchTerm, 0);
// open a 2nd reader to make sure first reader can
// commit its changes (.del) while second reader
// is open:
IndexReader reader2 = IndexReader.open(dir);
reader.close(); reader.close();
// CREATE A NEW READER and re-test // CREATE A NEW READER and re-test
@ -231,11 +237,74 @@ public class TestIndexReader extends TestCase
reader.close(); reader.close();
} }
// Make sure you can set norms & commit even if a reader
// is open against the index:
public void testWritingNorms() throws IOException
{
String tempDir = System.getProperty("tempDir");
if (tempDir == null)
throw new IOException("tempDir undefined, cannot run test");
File indexDir = new File(tempDir, "lucenetestnormwriter");
Directory dir = FSDirectory.getDirectory(indexDir, true);
IndexWriter writer = null;
IndexReader reader = null;
Term searchTerm = new Term("content", "aaa");
// add 1 documents with term : aaa
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDoc(writer, searchTerm.text());
writer.close();
// now open reader & set norm for doc 0
reader = IndexReader.open(dir);
reader.setNorm(0, "content", (float) 2.0);
// we should be holding the write lock now:
assertTrue("locked", IndexReader.isLocked(dir));
reader.commit();
// we should not be holding the write lock now:
assertTrue("not locked", !IndexReader.isLocked(dir));
// open a 2nd reader:
IndexReader reader2 = IndexReader.open(dir);
// set norm again for doc 0
reader.setNorm(0, "content", (float) 3.0);
assertTrue("locked", IndexReader.isLocked(dir));
reader.close();
// we should not be holding the write lock now:
assertTrue("not locked", !IndexReader.isLocked(dir));
reader2.close();
dir.close();
rmDir(indexDir);
}
public void testDeleteReaderWriterConflictUnoptimized() throws IOException{ public void testDeleteReaderWriterConflictUnoptimized() throws IOException{
deleteReaderWriterConflict(false); deleteReaderWriterConflict(false);
} }
public void testOpenEmptyDirectory() throws IOException{
String dirName = "test.empty";
File fileDirName = new File(dirName);
if (!fileDirName.exists()) {
fileDirName.mkdir();
}
try {
IndexReader reader = IndexReader.open(fileDirName);
fail("opening IndexReader on empty directory failed to produce FileNotFoundException");
} catch (FileNotFoundException e) {
// GOOD
}
}
public void testDeleteReaderWriterConflictOptimized() throws IOException{ public void testDeleteReaderWriterConflictOptimized() throws IOException{
deleteReaderWriterConflict(true); deleteReaderWriterConflict(true);
} }
@ -368,12 +437,36 @@ public class TestIndexReader extends TestCase
assertFalse(IndexReader.isLocked(dir)); // reader only, no lock assertFalse(IndexReader.isLocked(dir)); // reader only, no lock
long version = IndexReader.lastModified(dir); long version = IndexReader.lastModified(dir);
reader.close(); reader.close();
// modify index and check version has been incremented: // modify index and check version has been
// incremented:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true); writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer); addDocumentWithFields(writer);
writer.close(); writer.close();
reader = IndexReader.open(dir); reader = IndexReader.open(dir);
assertTrue(version < IndexReader.getCurrentVersion(dir)); assertTrue("old lastModified is " + version + "; new lastModified is " + IndexReader.lastModified(dir), version <= IndexReader.lastModified(dir));
reader.close();
}
public void testVersion() throws IOException {
assertFalse(IndexReader.indexExists("there_is_no_such_index"));
Directory dir = new RAMDirectory();
assertFalse(IndexReader.indexExists(dir));
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
assertTrue(IndexReader.isLocked(dir)); // writer open, so dir is locked
writer.close();
assertTrue(IndexReader.indexExists(dir));
IndexReader reader = IndexReader.open(dir);
assertFalse(IndexReader.isLocked(dir)); // reader only, no lock
long version = IndexReader.getCurrentVersion(dir);
reader.close();
// modify index and check version has been
// incremented:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
writer.close();
reader = IndexReader.open(dir);
assertTrue("old version is " + version + "; new version is " + IndexReader.getCurrentVersion(dir), version < IndexReader.getCurrentVersion(dir));
reader.close(); reader.close();
} }
@ -412,6 +505,40 @@ public class TestIndexReader extends TestCase
reader.close(); reader.close();
} }
public void testUndeleteAllAfterClose() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
addDocumentWithFields(writer);
writer.close();
IndexReader reader = IndexReader.open(dir);
reader.deleteDocument(0);
reader.deleteDocument(1);
reader.close();
reader = IndexReader.open(dir);
reader.undeleteAll();
assertEquals(2, reader.numDocs()); // nothing has really been deleted thanks to undeleteAll()
reader.close();
}
public void testUndeleteAllAfterCloseThenReopen() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDocumentWithFields(writer);
addDocumentWithFields(writer);
writer.close();
IndexReader reader = IndexReader.open(dir);
reader.deleteDocument(0);
reader.deleteDocument(1);
reader.close();
reader = IndexReader.open(dir);
reader.undeleteAll();
reader.close();
reader = IndexReader.open(dir);
assertEquals(2, reader.numDocs()); // nothing has really been deleted thanks to undeleteAll()
reader.close();
}
public void testDeleteReaderReaderConflictUnoptimized() throws IOException{ public void testDeleteReaderReaderConflictUnoptimized() throws IOException{
deleteReaderReaderConflict(false); deleteReaderReaderConflict(false);
} }
@ -562,4 +689,11 @@ public class TestIndexReader extends TestCase
doc.add(new Field("content", value, Field.Store.NO, Field.Index.TOKENIZED)); doc.add(new Field("content", value, Field.Store.NO, Field.Index.TOKENIZED));
writer.addDocument(doc); writer.addDocument(doc);
} }
private void rmDir(File dir) {
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
files[i].delete();
}
dir.delete();
}
} }

View File

@ -1,6 +1,7 @@
package org.apache.lucene.index; package org.apache.lucene.index;
import java.io.IOException; import java.io.IOException;
import java.io.File;
import junit.framework.TestCase; import junit.framework.TestCase;
@ -10,7 +11,10 @@ import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader; import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter; import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory; import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory; import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
/** /**
@ -28,14 +32,11 @@ public class TestIndexWriter extends TestCase
int i; int i;
IndexWriter.setDefaultWriteLockTimeout(2000); IndexWriter.setDefaultWriteLockTimeout(2000);
IndexWriter.setDefaultCommitLockTimeout(2000);
assertEquals(2000, IndexWriter.getDefaultWriteLockTimeout()); assertEquals(2000, IndexWriter.getDefaultWriteLockTimeout());
assertEquals(2000, IndexWriter.getDefaultCommitLockTimeout());
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true); writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
IndexWriter.setDefaultWriteLockTimeout(1000); IndexWriter.setDefaultWriteLockTimeout(1000);
IndexWriter.setDefaultCommitLockTimeout(1000);
// add 100 documents // add 100 documents
for (i = 0; i < 100; i++) { for (i = 0; i < 100; i++) {
@ -72,6 +73,12 @@ public class TestIndexWriter extends TestCase
assertEquals(60, reader.maxDoc()); assertEquals(60, reader.maxDoc());
assertEquals(60, reader.numDocs()); assertEquals(60, reader.numDocs());
reader.close(); reader.close();
// make sure opening a new index for create over
// this existing one works correctly:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
assertEquals(0, writer.docCount());
writer.close();
} }
private void addDoc(IndexWriter writer) throws IOException private void addDoc(IndexWriter writer) throws IOException
@ -80,4 +87,192 @@ public class TestIndexWriter extends TestCase
doc.add(new Field("content", "aaa", Field.Store.NO, Field.Index.TOKENIZED)); doc.add(new Field("content", "aaa", Field.Store.NO, Field.Index.TOKENIZED));
writer.addDocument(doc); writer.addDocument(doc);
} }
// Make sure we can open an index for create even when a
// reader holds it open (this fails pre lock-less
// commits on windows):
public void testCreateWithReader() throws IOException {
String tempDir = System.getProperty("java.io.tmpdir");
if (tempDir == null)
throw new IOException("java.io.tmpdir undefined, cannot run test");
File indexDir = new File(tempDir, "lucenetestindexwriter");
Directory dir = FSDirectory.getDirectory(indexDir, true);
// add one document & close writer
IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
addDoc(writer);
writer.close();
// now open reader:
IndexReader reader = IndexReader.open(dir);
assertEquals("should be one document", reader.numDocs(), 1);
// now open index for create:
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
assertEquals("should be zero documents", writer.docCount(), 0);
addDoc(writer);
writer.close();
assertEquals("should be one document", reader.numDocs(), 1);
IndexReader reader2 = IndexReader.open(dir);
assertEquals("should be one document", reader2.numDocs(), 1);
reader.close();
reader2.close();
rmDir(indexDir);
}
// Simulate a writer that crashed while writing segments
// file: make sure we can still open the index (ie,
// gracefully fallback to the previous segments file),
// and that we can add to the index:
public void testSimulatedCrashedWriter() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = null;
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}
// close
writer.close();
long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);
// Make the next segments file, with last byte
// missing, to simulate a writer that crashed while
// writing segments file:
String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
1+gen);
IndexInput in = dir.openInput(fileNameIn);
IndexOutput out = dir.createOutput(fileNameOut);
long length = in.length();
for(int i=0;i<length-1;i++) {
out.writeByte(in.readByte());
}
in.close();
out.close();
IndexReader reader = null;
try {
reader = IndexReader.open(dir);
} catch (Exception e) {
fail("reader failed to open on a crashed index");
}
reader.close();
try {
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
} catch (Exception e) {
fail("writer failed to open on a crashed index");
}
// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}
// close
writer.close();
}
// Simulate a corrupt index by removing last byte of
// latest segments file and make sure we get an
// IOException trying to open the index:
public void testSimulatedCorruptIndex1() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = null;
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}
// close
writer.close();
long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);
String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
"",
1+gen);
IndexInput in = dir.openInput(fileNameIn);
IndexOutput out = dir.createOutput(fileNameOut);
long length = in.length();
for(int i=0;i<length-1;i++) {
out.writeByte(in.readByte());
}
in.close();
out.close();
dir.deleteFile(fileNameIn);
IndexReader reader = null;
try {
reader = IndexReader.open(dir);
fail("reader did not hit IOException on opening a corrupt index");
} catch (Exception e) {
}
if (reader != null) {
reader.close();
}
}
// Simulate a corrupt index by removing one of the cfs
// files and make sure we get an IOException trying to
// open the index:
public void testSimulatedCorruptIndex2() throws IOException {
Directory dir = new RAMDirectory();
IndexWriter writer = null;
writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
// add 100 documents
for (int i = 0; i < 100; i++) {
addDoc(writer);
}
// close
writer.close();
long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
assertTrue("segment generation should be > 1 but got " + gen, gen > 1);
String[] files = dir.list();
for(int i=0;i<files.length;i++) {
if (files[i].endsWith(".cfs")) {
dir.deleteFile(files[i]);
break;
}
}
IndexReader reader = null;
try {
reader = IndexReader.open(dir);
fail("reader did not hit IOException on opening a corrupt index");
} catch (Exception e) {
}
if (reader != null) {
reader.close();
}
}
private void rmDir(File dir) {
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
files[i].delete();
}
dir.delete();
}
} }

View File

@ -80,6 +80,21 @@ public class TestMultiReader extends TestCase {
assertEquals( 1, reader.numDocs() ); assertEquals( 1, reader.numDocs() );
reader.undeleteAll(); reader.undeleteAll();
assertEquals( 2, reader.numDocs() ); assertEquals( 2, reader.numDocs() );
// Ensure undeleteAll survives commit/close/reopen:
reader.commit();
reader.close();
sis.read(dir);
reader = new MultiReader(dir, sis, false, readers);
assertEquals( 2, reader.numDocs() );
reader.deleteDocument(0);
assertEquals( 1, reader.numDocs() );
reader.commit();
reader.close();
sis.read(dir);
reader = new MultiReader(dir, sis, false, readers);
assertEquals( 1, reader.numDocs() );
} }

View File

@ -58,9 +58,9 @@ public class TestLockFactory extends TestCase {
// Both write lock and commit lock should have been created: // Both write lock and commit lock should have been created:
assertEquals("# of unique locks created (after instantiating IndexWriter)", assertEquals("# of unique locks created (after instantiating IndexWriter)",
2, lf.locksCreated.size()); 1, lf.locksCreated.size());
assertTrue("# calls to makeLock <= 2 (after instantiating IndexWriter)", assertTrue("# calls to makeLock is 0 (after instantiating IndexWriter)",
lf.makeLockCount > 2); lf.makeLockCount >= 1);
for(Enumeration e = lf.locksCreated.keys(); e.hasMoreElements();) { for(Enumeration e = lf.locksCreated.keys(); e.hasMoreElements();) {
String lockName = (String) e.nextElement(); String lockName = (String) e.nextElement();
@ -90,6 +90,7 @@ public class TestLockFactory extends TestCase {
try { try {
writer2 = new IndexWriter(dir, new WhitespaceAnalyzer(), false); writer2 = new IndexWriter(dir, new WhitespaceAnalyzer(), false);
} catch (Exception e) { } catch (Exception e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException with no locking"); fail("Should not have hit an IOException with no locking");
} }
@ -234,6 +235,7 @@ public class TestLockFactory extends TestCase {
try { try {
writer2 = new IndexWriter(indexDirName, new WhitespaceAnalyzer(), false); writer2 = new IndexWriter(indexDirName, new WhitespaceAnalyzer(), false);
} catch (IOException e) { } catch (IOException e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException with locking disabled"); fail("Should not have hit an IOException with locking disabled");
} }
@ -266,6 +268,7 @@ public class TestLockFactory extends TestCase {
try { try {
fs2 = FSDirectory.getDirectory(indexDirName, true, lf); fs2 = FSDirectory.getDirectory(indexDirName, true, lf);
} catch (IOException e) { } catch (IOException e) {
e.printStackTrace(System.out);
fail("Should not have hit an IOException because LockFactory instances are the same"); fail("Should not have hit an IOException because LockFactory instances are the same");
} }
@ -294,7 +297,6 @@ public class TestLockFactory extends TestCase {
public void _testStressLocks(LockFactory lockFactory, String indexDirName) throws IOException { public void _testStressLocks(LockFactory lockFactory, String indexDirName) throws IOException {
FSDirectory fs1 = FSDirectory.getDirectory(indexDirName, true, lockFactory); FSDirectory fs1 = FSDirectory.getDirectory(indexDirName, true, lockFactory);
// fs1.setLockFactory(NoLockFactory.getNoLockFactory());
// First create a 1 doc index: // First create a 1 doc index:
IndexWriter w = new IndexWriter(fs1, new WhitespaceAnalyzer(), true); IndexWriter w = new IndexWriter(fs1, new WhitespaceAnalyzer(), true);
@ -405,6 +407,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Writer: creation hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Writer: creation hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
if (writer != null) { if (writer != null) {
try { try {
@ -413,6 +416,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Writer: addDoc hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Writer: addDoc hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
try { try {
writer.close(); writer.close();
@ -420,6 +424,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Writer: close hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Writer: close hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
writer = null; writer = null;
} }
@ -446,6 +451,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Searcher: create hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Searcher: create hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
if (searcher != null) { if (searcher != null) {
Hits hits = null; Hits hits = null;
@ -455,6 +461,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Searcher: search hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Searcher: search hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
// System.out.println(hits.length() + " total results"); // System.out.println(hits.length() + " total results");
try { try {
@ -463,6 +470,7 @@ public class TestLockFactory extends TestCase {
hitException = true; hitException = true;
System.out.println("Stress Test Index Searcher: close hit unexpected exception: " + e.toString()); System.out.println("Stress Test Index Searcher: close hit unexpected exception: " + e.toString());
e.printStackTrace(System.out); e.printStackTrace(System.out);
break;
} }
searcher = null; searcher = null;
} }

View File

@ -14,7 +14,7 @@
<p> <p>
This document defines the index file formats used This document defines the index file formats used
in Lucene version 2.0. If you are using a different in Lucene version 2.1. If you are using a different
version of Lucene, please consult the copy of version of Lucene, please consult the copy of
<code>docs/fileformats.html</code> that was distributed <code>docs/fileformats.html</code> that was distributed
with the version you are using. with the version you are using.
@ -43,6 +43,18 @@
describing how file formats have changed from prior versions. describing how file formats have changed from prior versions.
</p> </p>
<p>
In version 2.1, the file format was changed to allow
lock-less commits (ie, no more commit lock). The
change is fully backwards compatible: you can open a
pre-2.1 index for searching or adding/deleting of
docs. When the new segments file is saved
(committed), it will be written in the new file format
(meaning no specific "upgrade" process is needed).
But note that once a commit has occurred, pre-2.1
Lucene will not be able to read the index.
</p>
</section> </section>
<section name="Definitions"> <section name="Definitions">
@ -260,6 +272,18 @@
required. required.
</p> </p>
<p>
As of version 2.1 (lock-less commits), file names are
never re-used (there is one exception, "segments.gen",
see below). That is, when any file is saved to the
Directory it is given a never before used filename.
This is achieved using a simple generations approach.
For example, the first segments file is segments_1,
then segments_2, etc. The generation is a sequential
long integer represented in alpha-numeric (base 36)
form.
</p>
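For example (an illustrative computation; Character.MAX_RADIX is 36 in Java), generation 10 is rendered "a", so that segments file is named segments_a:
    // gen 1 -> "segments_1", gen 10 -> "segments_a", gen 36 -> "segments_10"
    String fileName = "segments_" + Long.toString(gen, Character.MAX_RADIX);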
</section> </section>
<section name="Primitive Types"> <section name="Primitive Types">
@ -696,22 +720,48 @@
<p> <p>
The active segments in the index are stored in the The active segments in the index are stored in the
segment info file. An index only has segment info file, <tt>segments_N</tt>. There may
a single file in this format, and it is named "segments". be one or more <tt>segments_N</tt> files in the
This lists each segment by name, and also contains the size of each index; however, the one with the largest
segment. generation is the active one (when older
segments_N files are present it's because they
temporarily cannot be deleted, or a writer is in
the process of committing). This file lists each
segment by name, has details about the separate
norms and deletion files, and also contains the
size of each segment.
</p> </p>
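Choosing the active generation from a directory listing can be sketched as follows (a hypothetical helper; the real logic lives in SegmentInfos.getCurrentSegmentGeneration):
    // Pick the largest N among segments_N files; a plain "segments"
    // file (pre-2.1) counts as generation 0.
    long max = -1;
    String[] files = dir.list();
    for (int i = 0; i < files.length; i++) {
      String f = files[i];
      if (f.startsWith("segments") && !f.equals("segments.gen")) {
        long gen = 0;
        if (f.length() > "segments".length()) {
          gen = Long.parseLong(f.substring(1 + "segments".length()),
                               Character.MAX_RADIX);
        }
        if (gen > max) max = gen;
      }
    }
    // max is now the current generation, or -1 if none exists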
<p> <p>
As of 2.1, there is also a file
<tt>segments.gen</tt>. This file contains the
current generation (the <tt>_N</tt> in
<tt>segments_N</tt>) of the index. This is
used only as a fallback in case the current
generation cannot be accurately determined by
directory listing alone (as is the case for some
NFS clients with time-based directory cache
expiration). This file simply contains an Int32
version header (SegmentInfos.FORMAT_LOCKLESS =
-2), followed by the generation recorded as Int64,
written twice.
</p>
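A reader can validate this file with a few reads (a sketch following the format just described, mirroring the fallback logic in SegmentInfos; error handling elided):
    IndexInput in = dir.openInput("segments.gen");
    try {
      int format = in.readInt();  // expect FORMAT_LOCKLESS (-2)
      long gen0 = in.readLong();
      long gen1 = in.readLong();  // the generation is written twice
      if (format == -2 && gen0 == gen1) {
        // consistent; gen0 is a candidate current generation
      }
    } finally {
      in.close();
    }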
<p>
<b>Pre-2.1:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;<sup>SegCount</sup> Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize&gt;<sup>SegCount</sup>
</p> </p>
<p> <p>
Format, NameCounter, SegCount, SegSize --&gt; UInt32 <b>2.1 and above:</b>
Segments --&gt; Format, Version, NameCounter, SegCount, &lt;SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup>, IsCompoundFile&gt;<sup>SegCount</sup>
</p> </p>
<p> <p>
Version --&gt; UInt64 Format, NameCounter, SegCount, SegSize, NumField --&gt; Int32
</p>
<p>
Version, DelGen, NormGen --&gt; Int64
</p> </p>
<p> <p>
@ -719,7 +769,11 @@
</p> </p>
<p> <p>
Format is -1 in Lucene 1.4. IsCompoundFile --&gt; Int8
</p>
<p>
Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
</p> </p>
<p> <p>
@ -740,65 +794,79 @@
SegSize is the number of documents contained in the segment index. SegSize is the number of documents contained in the segment index.
</p> </p>
<p>
DelGen is the generation count of the separate
deletes file. If this is -1, there are no
separate deletes. If it is 0, this is a pre-2.1
segment and you must check the filesystem for the
existence of _X.del. Anything above zero means
there are separate deletes (_X_N.del).
</p>
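Concretely (an illustrative mapping that mirrors the generation naming above; a hypothetical helper, not the actual SegmentInfo code):
    // Segment "_a": DelGen -1 -> no deletes file
    //               DelGen  0 -> look for "_a.del" on disk (pre-2.1)
    //               DelGen  3 -> "_a_3.del"
    static String delFileName(String segName, long delGen) {
      if (delGen == -1) return null;
      if (delGen == 0) return segName + ".del";
      return segName + "_" + Long.toString(delGen, Character.MAX_RADIX) + ".del";
    }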
<p>
NumField is the size of the array for NormGen, or
-1 if there are no NormGens stored.
</p>
<p>
NormGen records the generation of the separate
norms files, one value per field. If NumField is
-1, no NormGens are stored; in that case they are
all assumed to be 0 if the segments file was
written pre-2.1, and all assumed to be -1 if the
segments file is 2.1 or above. The generation has
the same meaning as DelGen (above).
</p>
<p>
IsCompoundFile records whether the segment is
written as a compound file or not. If this is -1,
the segment is not a compound file. If it is 1,
the segment is a compound file. Otherwise it is 0,
which means we check the filesystem to see if _X.cfs
exists.
</p>
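Putting the per-segment fields together, reading one entry under the 2.1 grammar looks roughly like this (a sketch against the IndexInput API used elsewhere in this commit; the real SegmentInfo constructor does more):
    String name = input.readString();       // SegName
    int docCount = input.readInt();         // SegSize
    long delGen = input.readLong();         // DelGen: -1, 0, or > 0
    int numField = input.readInt();         // NumField, or -1 if no NormGens
    long[] normGen = null;
    if (numField != -1) {
      normGen = new long[numField];
      for (int j = 0; j < numField; j++) {
        normGen[j] = input.readLong();      // NormGen, one per field
      }
    }
    byte isCompoundFile = input.readByte(); // IsCompoundFile: -1, 0, or 1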
</subsection> </subsection>
<subsection name="Lock Files"> <subsection name="Lock File">
<p> <p>
Several files are used to indicate that another A write lock is used to indicate that another
process is using an index. Note that these files are not process is writing to the index. Note that this file is not
stored in the index directory itself, but rather in the stored in the index directory itself, but rather in the
system's temporary directory, as indicated in the Java system's temporary directory, as indicated in the Java
system property "java.io.tmpdir". system property "java.io.tmpdir".
</p> </p>
<ul>
<li>
<p> <p>
When a file named "commit.lock" The write lock is named "XXXX-write.lock" where
is present, a process is currently re-writing the "segments" XXXX is typically a unique prefix computed by the
file and deleting outdated segment index files, or a process is directory path to the index. When this file is
reading the "segments" present, a process is currently adding documents
file and opening the files of the segments it names. This lock file to an index, or removing files from that index.
prevents files from being deleted by another process after a process This lock file prevents several processes from
has read the "segments" attempting to modify an index at the same time.
file but before it has managed to open all of the files of the
segments named therein.
</p> </p>
</li>
<li>
<p> <p>
When a file named "write.lock" Note that prior to version 2.1, Lucene also used a
is present, a process is currently adding documents to an index, or commit lock. This was removed in 2.1.
removing files from that index. This lock file prevents several
processes from attempting to modify an index at the same time.
</p> </p>
</li>
</ul>
</subsection> </subsection>
<subsection name="Deletable File"> <subsection name="Deletable File">
<p> <p>
A file named "deletable" Prior to Lucene 2.1 there was a file "deletable"
contains the names of files that are no longer used by the index, but that contained details about files that need to be
which could not be deleted. This is only used on Win32, where a deleted. As of 2.1, a writer dynamically computes
file may not be deleted while it is still open. On other platforms the files that are deletable, instead, so no file
the file contains only null bytes. is written.
</p> </p>
<p>
Deletable --&gt; DeletableCount,
&lt;DelableName&gt;<sup>DeletableCount</sup>
</p>
<p>DeletableCount --&gt; UInt32
</p>
<p>DeletableName --&gt;
String
</p>
</subsection> </subsection>
<subsection name="Compound Files"> <subsection name="Compound Files">