mirror of https://github.com/apache/lucene.git

commit d634ccf4e9
parent bd6f012511

Lockless commits

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@476359 13f79535-47bb-0310-9956-ffa450edef68
@@ -104,6 +104,15 @@ API Changes
  9. LUCENE-657: Made FuzzyQuery non-final and inner ScoreTerm protected.
    (Steven Parkes via Otis Gospodnetic)

+10. LUCENE-701: Lockless commits: a commit lock is no longer required
+    when a writer commits and a reader opens the index. This includes
+    a change to the index file format (see docs/fileformats.html for
+    details). It also removes all APIs associated with the commit
+    lock & its timeout. Readers are now truly read-only and do not
+    block one another on startup. This is the first step to getting
+    Lucene to work correctly over NFS (second step is
+    LUCENE-710). (Mike McCandless)
+
 Bug fixes

  1. Fixed the web application demo (built with "ant war-demo") which
@@ -118,7 +118,7 @@ limitations under the License.
 <blockquote>
 <p>
 This document defines the index file formats used
-in Lucene version 2.0. If you are using a different
+in Lucene version 2.1. If you are using a different
 version of Lucene, please consult the copy of
 <code>docs/fileformats.html</code> that was distributed
 with the version you are using.
@@ -142,6 +142,17 @@ limitations under the License.
 <p>
 Compatibility notes are provided in this document,
 describing how file formats have changed from prior versions.
+</p>
+<p>
+In version 2.1, the file format was changed to allow
+lock-less commits (ie, no more commit lock). The
+change is fully backwards compatible: you can open a
+pre-2.1 index for searching or adding/deleting of
+docs. When the new segments file is saved
+(committed), it will be written in the new file format
+(meaning no specific "upgrade" process is needed).
+But note that once a commit has occurred, pre-2.1
+Lucene will not be able to read the index.
 </p>
 </blockquote>
 </p>
@@ -403,6 +414,17 @@ limitations under the License.
 Typically, all segments
 in an index are stored in a single directory, although this is not
 required.
+</p>
+<p>
+As of version 2.1 (lock-less commits), file names are
+never re-used (there is one exception, "segments.gen",
+see below). That is, when any file is saved to the
+Directory it is given a never before used filename.
+This is achieved using a simple generations approach.
+For example, the first segments file is segments_1,
+then segments_2, etc. The generation is a sequential
+long integer represented in alpha-numeric (base 36)
+form.
 </p>
 </blockquote>
 </p>
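The generations approach described in this hunk (a sequential long rendered in base 36) can be sketched as follows. `SegmentsFileNameDemo` and `segmentsFileName` are illustrative names for this sketch, not part of Lucene's API; the real naming logic lives in `IndexFileNames.fileNameFromGeneration`, added later in this commit.

```java
public class SegmentsFileNameDemo {
    // Hypothetical helper mirroring the naming scheme described above:
    // generation 0 maps to the pre-2.1 name "segments"; generation N > 0
    // maps to "segments_" + N rendered in base 36.
    static String segmentsFileName(long gen) {
        if (gen == 0) {
            return "segments";
        }
        // Character.MAX_RADIX is 36, so this matches the "alpha-numeric
        // (base 36)" representation the format document describes.
        return "segments_" + Long.toString(gen, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        System.out.println(segmentsFileName(1));   // segments_1
        System.out.println(segmentsFileName(10));  // segments_a
        System.out.println(segmentsFileName(36));  // segments_10
    }
}
```

Because names are never re-used, a reader can always tell the newest commit point by comparing generations, without any lock.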
@@ -1080,25 +1102,53 @@ limitations under the License.
 <blockquote>
 <p>
 The active segments in the index are stored in the
-segment info file. An index only has
-a single file in this format, and it is named "segments".
-This lists each segment by name, and also contains the size of each
-segment.
+segment info file, <tt>segments_N</tt>. There may
+be one or more <tt>segments_N</tt> files in the
+index; however, the one with the largest
+generation is the active one (when older
+segments_N files are present it's because they
+temporarily cannot be deleted, or, a writer is in
+the process of committing). This file lists each
+segment by name, has details about the separate
+norms and deletion files, and also contains the
+size of each segment.
 </p>
 <p>
+As of 2.1, there is also a file
+<tt>segments.gen</tt>. This file contains the
+current generation (the <tt>_N</tt> in
+<tt>segments_N</tt>) of the index. This is
+used only as a fallback in case the current
+generation cannot be accurately determined by
+directory listing alone (as is the case for some
+NFS clients with time-based directory cache
+expiration). This file simply contains an Int32
+version header (SegmentInfos.FORMAT_LOCKLESS =
+-2), followed by the generation recorded as Int64,
+written twice.
+</p>
+<p>
+<b>Pre-2.1:</b>
 Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize><sup>SegCount</sup>
 </p>
 <p>
-Format, NameCounter, SegCount, SegSize --> UInt32
+<b>2.1 and above:</b>
+Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> ><sup>SegCount</sup>, IsCompoundFile
 </p>
 <p>
-Version --> UInt64
+Format, NameCounter, SegCount, SegSize, NumField --> Int32
+</p>
+<p>
+Version, DelGen, NormGen --> Int64
 </p>
 <p>
 SegName --> String
 </p>
 <p>
-Format is -1 in Lucene 1.4.
+IsCompoundFile --> Int8
+</p>
+<p>
+Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
 </p>
 <p>
 Version counts how often the index has been
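As a sketch of the segments.gen layout this hunk describes (an Int32 format header of -2, then the generation as Int64 written twice), the file could be parsed like this. `SegmentsGenDemo` is an illustrative class, not Lucene's actual reader; it assumes big-endian integers as written by Java's `DataOutput`, which matches how Lucene's own output streams encode Int32/Int64.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SegmentsGenDemo {
    static final int FORMAT_LOCKLESS = -2;

    // Parse the layout described above: Int32 format header, then the
    // generation recorded as Int64, written twice.
    static long readGeneration(byte[] contents) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(contents));
        int format = in.readInt();
        if (format != FORMAT_LOCKLESS) {
            throw new IOException("unexpected format: " + format);
        }
        long gen0 = in.readLong();
        long gen1 = in.readLong();
        if (gen0 != gen1) {
            // The two copies disagree: the file was caught mid-write,
            // so fall back to directory listing instead.
            throw new IOException("generations differ: " + gen0 + " vs " + gen1);
        }
        return gen0;
    }

    public static void main(String[] args) throws IOException {
        // Build a well-formed segments.gen image for generation 42:
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(FORMAT_LOCKLESS);
        out.writeLong(42);
        out.writeLong(42);
        System.out.println(readGeneration(bytes.toByteArray())); // 42
    }
}
```

Writing the generation twice lets a reader detect a torn write: if the two Int64 values disagree, the fallback value is ignored.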
@@ -1113,6 +1163,35 @@ limitations under the License.
 </p>
 <p>
 SegSize is the number of documents contained in the segment index.
+</p>
+<p>
+DelGen is the generation count of the separate
+deletes file. If this is -1, there are no
+separate deletes. If it is 0, this is a pre-2.1
+segment and you must check filesystem for the
+existence of _X.del. Anything above zero means
+there are separate deletes (_X_N.del).
+</p>
+<p>
+NumField is the size of the array for NormGen, or
+-1 if there are no NormGens stored.
+</p>
+<p>
+NormGen records the generation of the separate
+norms files. If NumField is -1, there are no
+normGens stored and they are all assumed to be 0
+when the segment file was written pre-2.1 and all
+assumed to be -1 when the segments file is 2.1 or
+above. The generation then has the same meaning
+as delGen (above).
+</p>
+<p>
+IsCompoundFile records whether the segment is
+written as a compound file or not. If this is -1,
+the segment is not a compound file. If it is 1,
+the segment is a compound file. Else it is 0,
+which means we check filesystem to see if _X.cfs
+exists.
 </p>
 </blockquote>
 </td></tr>
@@ -1121,42 +1200,31 @@ limitations under the License.
 <table border="0" cellspacing="0" cellpadding="2" width="100%">
 <tr><td bgcolor="#828DA6">
 <font color="#ffffff" face="arial,helvetica,sanserif">
-<a name="Lock Files"><strong>Lock Files</strong></a>
+<a name="Lock File"><strong>Lock File</strong></a>
 </font>
 </td></tr>
 <tr><td>
 <blockquote>
 <p>
-Several files are used to indicate that another
-process is using an index. Note that these files are not
+A write lock is used to indicate that another
+process is writing to the index. Note that this file is not
 stored in the index directory itself, but rather in the
 system's temporary directory, as indicated in the Java
 system property "java.io.tmpdir".
 </p>
-<ul>
-<li>
 <p>
-When a file named "commit.lock"
-is present, a process is currently re-writing the "segments"
-file and deleting outdated segment index files, or a process is
-reading the "segments"
-file and opening the files of the segments it names. This lock file
-prevents files from being deleted by another process after a process
-has read the "segments"
-file but before it has managed to open all of the files of the
-segments named therein.
+The write lock is named "XXXX-write.lock" where
+XXXX is typically a unique prefix computed by the
+directory path to the index. When this file is
+present, a process is currently adding documents
+to an index, or removing files from that index.
+This lock file prevents several processes from
+attempting to modify an index at the same time.
 </p>
-</li>
-
-<li>
 <p>
-When a file named "write.lock"
-is present, a process is currently adding documents to an index, or
-removing files from that index. This lock file prevents several
-processes from attempting to modify an index at the same time.
+Note that prior to version 2.1, Lucene also used a
+commit lock. This was removed in 2.1.
 </p>
-</li>
-</ul>
 </blockquote>
 </td></tr>
 <tr><td><br/></td></tr>
@@ -1170,20 +1238,11 @@ limitations under the License.
 <tr><td>
 <blockquote>
 <p>
-A file named "deletable"
-contains the names of files that are no longer used by the index, but
-which could not be deleted. This is only used on Win32, where a
-file may not be deleted while it is still open. On other platforms
-the file contains only null bytes.
-</p>
-<p>
-Deletable --> DeletableCount,
-<DelableName><sup>DeletableCount</sup>
-</p>
-<p>DeletableCount --> UInt32
-</p>
-<p>DeletableName -->
-String
+Prior to Lucene 2.1 there was a file "deletable"
+that contained details about files that need to be
+deleted. As of 2.1, a writer dynamically computes
+the files that are deletable, instead, so no file
+is written.
 </p>
 </blockquote>
 </td></tr>
@@ -0,0 +1,219 @@
+package org.apache.lucene.index;
+
+import org.apache.lucene.index.IndexFileNames;
+import org.apache.lucene.index.IndexFileNameFilter;
+import org.apache.lucene.index.SegmentInfos;
+import org.apache.lucene.store.Directory;
+
+import java.io.IOException;
+import java.io.PrintStream;
+import java.util.Vector;
+import java.util.HashMap;
+
+/**
+ * A utility class (used by both IndexReader and
+ * IndexWriter) to keep track of files that need to be
+ * deleted because they are no longer referenced by the
+ * index.
+ */
+public class IndexFileDeleter {
+  private Vector deletable;
+  private Vector pending;
+  private Directory directory;
+  private SegmentInfos segmentInfos;
+  private PrintStream infoStream;
+
+  public IndexFileDeleter(SegmentInfos segmentInfos, Directory directory)
+    throws IOException {
+    this.segmentInfos = segmentInfos;
+    this.directory = directory;
+  }
+
+  void setInfoStream(PrintStream infoStream) {
+    this.infoStream = infoStream;
+  }
+
+  /** Determine index files that are no longer referenced
+   * and therefore should be deleted. This is called once
+   * (by the writer), and then subsequently we add onto
+   * deletable any files that are no longer needed at the
+   * point that we create the unused file (eg when merging
+   * segments), and we only remove from deletable when a
+   * file is successfully deleted.
+   */
+
+  public void findDeletableFiles() throws IOException {
+
+    // Gather all "current" segments:
+    HashMap current = new HashMap();
+    for(int j=0;j<segmentInfos.size();j++) {
+      SegmentInfo segmentInfo = (SegmentInfo) segmentInfos.elementAt(j);
+      current.put(segmentInfo.name, segmentInfo);
+    }
+
+    // Then go through all files in the Directory that are
+    // Lucene index files, and add to deletable if they are
+    // not referenced by the current segments info:
+
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    IndexFileNameFilter filter = IndexFileNameFilter.getFilter();
+
+    String[] files = directory.list();
+
+    for (int i = 0; i < files.length; i++) {
+
+      if (filter.accept(null, files[i]) && !files[i].equals(segmentsInfosFileName) && !files[i].equals(IndexFileNames.SEGMENTS_GEN)) {
+
+        String segmentName;
+        String extension;
+
+        // First remove any extension:
+        int loc = files[i].indexOf('.');
+        if (loc != -1) {
+          extension = files[i].substring(1+loc);
+          segmentName = files[i].substring(0, loc);
+        } else {
+          extension = null;
+          segmentName = files[i];
+        }
+
+        // Then, remove any generation count:
+        loc = segmentName.indexOf('_', 1);
+        if (loc != -1) {
+          segmentName = segmentName.substring(0, loc);
+        }
+
+        // Delete this file if it's not a "current" segment,
+        // or, it is a single index file but there is now a
+        // corresponding compound file:
+        boolean doDelete = false;
+
+        if (!current.containsKey(segmentName)) {
+          // Delete if segment is not referenced:
+          doDelete = true;
+        } else {
+          // OK, segment is referenced, but file may still
+          // be orphan'd:
+          SegmentInfo info = (SegmentInfo) current.get(segmentName);
+
+          if (filter.isCFSFile(files[i]) && info.getUseCompoundFile()) {
+            // This file is in fact stored in a CFS file for
+            // this segment:
+            doDelete = true;
+          } else {
+
+            if ("del".equals(extension)) {
+              // This is a _segmentName_N.del file:
+              if (!files[i].equals(info.getDelFileName())) {
+                // If this is a separate .del file, but it
+                // doesn't match the current del filename for
+                // this segment, then delete it:
+                doDelete = true;
+              }
+            } else if (extension != null && extension.startsWith("s") && extension.matches("s\\d+")) {
+              int field = Integer.parseInt(extension.substring(1));
+              // This is a _segmentName_N.sX file:
+              if (!files[i].equals(info.getNormFileName(field))) {
+                // This is an orphan'd separate norms file:
+                doDelete = true;
+              }
+            }
+          }
+        }
+
+        if (doDelete) {
+          addDeletableFile(files[i]);
+          if (infoStream != null) {
+            infoStream.println("IndexFileDeleter: file \"" + files[i] + "\" is unreferenced in index and will be deleted on next commit");
+          }
+        }
+      }
+    }
+  }
+
+  /*
+   * Some operating systems (e.g. Windows) don't permit a file to be deleted
+   * while it is opened for read (e.g. by another process or thread). So we
+   * assume that when a delete fails it is because the file is open in another
+   * process, and queue the file for subsequent deletion.
+   */
+
+  public final void deleteSegments(Vector segments) throws IOException {
+
+    deleteFiles(); // try to delete files that we couldn't before
+
+    for (int i = 0; i < segments.size(); i++) {
+      SegmentReader reader = (SegmentReader)segments.elementAt(i);
+      if (reader.directory() == this.directory)
+        deleteFiles(reader.files()); // try to delete our files
+      else
+        deleteFiles(reader.files(), reader.directory()); // delete other files
+    }
+  }
+
+  public final void deleteFiles(Vector files, Directory directory)
+    throws IOException {
+    for (int i = 0; i < files.size(); i++)
+      directory.deleteFile((String)files.elementAt(i));
+  }
+
+  public final void deleteFiles(Vector files)
+    throws IOException {
+    deleteFiles(); // try to delete files that we couldn't before
+    for (int i = 0; i < files.size(); i++) {
+      deleteFile((String) files.elementAt(i));
+    }
+  }
+
+  public final void deleteFile(String file)
+    throws IOException {
+    try {
+      directory.deleteFile(file); // try to delete each file
+    } catch (IOException e) { // if delete fails
+      if (directory.fileExists(file)) {
+        if (infoStream != null)
+          infoStream.println("IndexFileDeleter: unable to remove file \"" + file + "\": " + e.toString() + "; Will re-try later.");
+        addDeletableFile(file); // add to deletable
+      }
+    }
+  }
+
+  final void clearPendingFiles() {
+    pending = null;
+  }
+
+  final void addPendingFile(String fileName) {
+    if (pending == null) {
+      pending = new Vector();
+    }
+    pending.addElement(fileName);
+  }
+
+  final void commitPendingFiles() {
+    if (pending != null) {
+      if (deletable == null) {
+        deletable = pending;
+        pending = null;
+      } else {
+        deletable.addAll(pending);
+        pending = null;
+      }
+    }
+  }
+
+  public final void addDeletableFile(String fileName) {
+    if (deletable == null) {
+      deletable = new Vector();
+    }
+    deletable.addElement(fileName);
+  }
+
+  public final void deleteFiles()
+    throws IOException {
+    if (deletable != null) {
+      Vector oldDeletable = deletable;
+      deletable = null;
+      deleteFiles(oldDeletable); // try to delete deletable
+    }
+  }
+}
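The filename decomposition that findDeletableFiles performs above (strip the extension, then strip any "_N" generation suffix to recover the base segment name) can be shown in isolation. `IndexFileNameParseDemo` is an illustrative class for this sketch, not part of Lucene.

```java
public class IndexFileNameParseDemo {
    // Mirrors the parsing in findDeletableFiles above: first remove any
    // extension, then remove any "_N" generation count (searching for '_'
    // from index 1, so a leading '_' in the segment name is preserved).
    static String segmentName(String fileName) {
        int loc = fileName.indexOf('.');
        String name = (loc != -1) ? fileName.substring(0, loc) : fileName;
        loc = name.indexOf('_', 1);
        if (loc != -1) {
            name = name.substring(0, loc);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(segmentName("_c.cfs"));    // _c
        System.out.println(segmentName("_c_3.del"));  // _c
        System.out.println(segmentName("_c_1.s0"));   // _c
    }
}
```

Once the base segment name is recovered, the deleter only has to ask whether that segment is still listed in the current SegmentInfos.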
@@ -19,6 +19,7 @@ package org.apache.lucene.index;
 import java.io.File;
 import java.io.FilenameFilter;
+import java.util.HashSet;
 
 /**
  * Filename filter that accepts filenames and extensions only created by Lucene.
@@ -28,18 +29,64 @@ import java.io.FilenameFilter;
  */
 public class IndexFileNameFilter implements FilenameFilter {
+
+  static IndexFileNameFilter singleton = new IndexFileNameFilter();
+  private HashSet extensions;
+
+  public IndexFileNameFilter() {
+    extensions = new HashSet();
+    for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
+      extensions.add(IndexFileNames.INDEX_EXTENSIONS[i]);
+    }
+  }
+
   /* (non-Javadoc)
    * @see java.io.FilenameFilter#accept(java.io.File, java.lang.String)
    */
   public boolean accept(File dir, String name) {
-    for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
-      if (name.endsWith("."+IndexFileNames.INDEX_EXTENSIONS[i]))
+    int i = name.lastIndexOf('.');
+    if (i != -1) {
+      String extension = name.substring(1+i);
+      if (extensions.contains(extension)) {
+        return true;
+      } else if (extension.startsWith("f") &&
+                 extension.matches("f\\d+")) {
+        return true;
+      } else if (extension.startsWith("s") &&
+                 extension.matches("s\\d+")) {
         return true;
       }
+    } else {
       if (name.equals(IndexFileNames.DELETABLE)) return true;
-      else if (name.equals(IndexFileNames.SEGMENTS)) return true;
-      else if (name.matches(".+\\.f\\d+")) return true;
+      else if (name.startsWith(IndexFileNames.SEGMENTS)) return true;
+    }
     return false;
   }
+
+  /**
+   * Returns true if this is a file that would be contained
+   * in a CFS file. This function should only be called on
+   * files that pass the above "accept" (ie, are already
+   * known to be a Lucene index file).
+   */
+  public boolean isCFSFile(String name) {
+    int i = name.lastIndexOf('.');
+    if (i != -1) {
+      String extension = name.substring(1+i);
+      if (extensions.contains(extension) &&
+          !extension.equals("del") &&
+          !extension.equals("gen") &&
+          !extension.equals("cfs")) {
+        return true;
+      }
+      if (extension.startsWith("f") &&
+          extension.matches("f\\d+")) {
+        return true;
+      }
+    }
+    return false;
+  }
+
+  public static IndexFileNameFilter getFilter() {
+    return singleton;
+  }
 }
@@ -28,18 +28,24 @@ final class IndexFileNames {
   /** Name of the index segment file */
   static final String SEGMENTS = "segments";
 
-  /** Name of the index deletable file */
+  /** Name of the generation reference file name */
+  static final String SEGMENTS_GEN = "segments.gen";
+
+  /** Name of the index deletable file (only used in
+   * pre-lockless indices) */
   static final String DELETABLE = "deletable";
 
   /**
-   * This array contains all filename extensions used by Lucene's index files, with
-   * one exception, namely the extension made up from <code>.f</code> + a number.
-   * Also note that two of Lucene's files (<code>deletable</code> and
-   * <code>segments</code>) don't have any filename extension.
+   * This array contains all filename extensions used by
+   * Lucene's index files, with two exceptions, namely the
+   * extension made up from <code>.f</code> + a number and
+   * from <code>.s</code> + a number. Also note that
+   * Lucene's <code>segments_N</code> files do not have any
+   * filename extension.
    */
   static final String INDEX_EXTENSIONS[] = new String[] {
     "cfs", "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del",
-    "tvx", "tvd", "tvf", "tvp" };
+    "tvx", "tvd", "tvf", "tvp", "gen"};
 
   /** File extensions of old-style index files */
   static final String COMPOUND_EXTENSIONS[] = new String[] {
@@ -51,4 +57,23 @@ final class IndexFileNames {
     "tvx", "tvd", "tvf"
   };
 
+  /**
+   * Computes the full file name from base, extension and
+   * generation. If the generation is -1, the file name is
+   * null. If it's 0, the file name is <base><extension>.
+   * If it's > 0, the file name is <base>_<generation><extension>.
+   *
+   * @param base -- main part of the file name
+   * @param extension -- extension of the filename (including .)
+   * @param gen -- generation
+   */
+  public static final String fileNameFromGeneration(String base, String extension, long gen) {
+    if (gen == -1) {
+      return null;
+    } else if (gen == 0) {
+      return base + extension;
+    } else {
+      return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
+    }
+  }
 }
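The fileNameFromGeneration helper added in this hunk ties the file-format rules together: DelGen -1 means no separate deletes (null name), 0 means the pre-2.1 name, and N > 0 means a base-36 suffixed name. The sketch below reproduces the same logic in a standalone class (`FileNameFromGenerationDemo` is an illustrative name) so the mapping can be run on its own.

```java
public class FileNameFromGenerationDemo {
    // Same logic as IndexFileNames.fileNameFromGeneration above,
    // reproduced here so the example is self-contained.
    static String fileNameFromGeneration(String base, String extension, long gen) {
        if (gen == -1) {
            return null;               // e.g. DelGen -1: no separate deletes file
        } else if (gen == 0) {
            return base + extension;   // pre-2.1 name, e.g. _c.del
        } else {
            // generation rendered in base 36, e.g. _c_b.del for gen 11
            return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
        }
    }

    public static void main(String[] args) {
        System.out.println(fileNameFromGeneration("_c", ".del", -1)); // null
        System.out.println(fileNameFromGeneration("_c", ".del", 0));  // _c.del
        System.out.println(fileNameFromGeneration("_c", ".del", 11)); // _c_b.del
    }
}
```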
@@ -113,6 +113,7 @@ public abstract class IndexReader {
   private Directory directory;
   private boolean directoryOwner;
   private boolean closeDirectory;
+  protected IndexFileDeleter deleter;
 
   private SegmentInfos segmentInfos;
   private Lock writeLock;
@@ -138,25 +139,41 @@
   }
 
   private static IndexReader open(final Directory directory, final boolean closeDirectory) throws IOException {
-    synchronized (directory) { // in- & inter-process sync
-      return (IndexReader)new Lock.With(
-          directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-          IndexWriter.COMMIT_LOCK_TIMEOUT) {
-        public Object doBody() throws IOException {
+    return (IndexReader) new SegmentInfos.FindSegmentsFile(directory) {
+      public Object doBody(String segmentFileName) throws IOException {
         SegmentInfos infos = new SegmentInfos();
-        infos.read(directory);
+        infos.read(directory, segmentFileName);
+
         if (infos.size() == 1) { // index is optimized
           return SegmentReader.get(infos, infos.info(0), closeDirectory);
-        }
-        IndexReader[] readers = new IndexReader[infos.size()];
-        for (int i = 0; i < infos.size(); i++)
-          readers[i] = SegmentReader.get(infos.info(i));
-        return new MultiReader(directory, infos, closeDirectory, readers);
+        } else {
+
+          // To reduce the chance of hitting FileNotFound
+          // (and having to retry), we open segments in
+          // reverse because IndexWriter merges & deletes
+          // the newest segments first.
+
+          IndexReader[] readers = new IndexReader[infos.size()];
+          for (int i = infos.size()-1; i >= 0; i--) {
+            try {
+              readers[i] = SegmentReader.get(infos.info(i));
+            } catch (IOException e) {
+              // Close all readers we had opened:
+              for(i++;i<infos.size();i++) {
+                readers[i].close();
+              }
+              throw e;
+            }
+          }
+
+          return new MultiReader(directory, infos, closeDirectory, readers);
+        }
       }
     }.run();
-    }
   }
 
   /** Returns the directory this index resides in. */
   public Directory directory() { return directory; }
@@ -175,8 +192,12 @@
    * Do not use this to check whether the reader is still up-to-date, use
    * {@link #isCurrent()} instead.
    */
-  public static long lastModified(File directory) throws IOException {
-    return FSDirectory.fileModified(directory, IndexFileNames.SEGMENTS);
+  public static long lastModified(final File fileDirectory) throws IOException {
+    return ((Long) new SegmentInfos.FindSegmentsFile(fileDirectory) {
+      public Object doBody(String segmentFileName) {
+        return new Long(FSDirectory.fileModified(fileDirectory, segmentFileName));
+      }
+    }.run()).longValue();
   }
 
   /**
@@ -184,8 +205,12 @@
    * Do not use this to check whether the reader is still up-to-date, use
    * {@link #isCurrent()} instead.
    */
-  public static long lastModified(Directory directory) throws IOException {
-    return directory.fileModified(IndexFileNames.SEGMENTS);
+  public static long lastModified(final Directory directory2) throws IOException {
+    return ((Long) new SegmentInfos.FindSegmentsFile(directory2) {
+      public Object doBody(String segmentFileName) throws IOException {
+        return new Long(directory2.fileModified(segmentFileName));
+      }
+    }.run()).longValue();
   }
 
   /**
@ -227,21 +252,7 @@ public abstract class IndexReader {
|
||||||
* @throws IOException if segments file cannot be read.
|
* @throws IOException if segments file cannot be read.
|
||||||
*/
|
*/
|
||||||
public static long getCurrentVersion(Directory directory) throws IOException {
|
public static long getCurrentVersion(Directory directory) throws IOException {
|
||||||
synchronized (directory) { // in- & inter-process sync
|
|
||||||
Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
|
|
||||||
|
|
||||||
boolean locked=false;
|
|
||||||
|
|
||||||
try {
|
|
||||||
locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
|
|
||||||
|
|
||||||
return SegmentInfos.readCurrentVersion(directory);
|
return SegmentInfos.readCurrentVersion(directory);
|
||||||
} finally {
|
|
||||||
if (locked) {
|
|
||||||
commitLock.release();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -259,21 +270,7 @@ public abstract class IndexReader {
    * @throws IOException
    */
   public boolean isCurrent() throws IOException {
-    synchronized (directory) {                 // in- & inter-process sync
-      Lock commitLock=directory.makeLock(IndexWriter.COMMIT_LOCK_NAME);
-
-      boolean locked=false;
-
-      try {
-        locked=commitLock.obtain(IndexWriter.COMMIT_LOCK_TIMEOUT);
-
-        return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
-      } finally {
-        if (locked) {
-          commitLock.release();
-        }
-      }
-    }
+    return SegmentInfos.readCurrentVersion(directory) == segmentInfos.getVersion();
   }
 
   /**
@@ -319,7 +316,7 @@ public abstract class IndexReader {
    * @return <code>true</code> if an index exists; <code>false</code> otherwise
    */
   public static boolean indexExists(String directory) {
-    return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+    return indexExists(new File(directory));
   }
 
   /**
@@ -328,8 +325,9 @@ public abstract class IndexReader {
    * @param directory the directory to check for an index
    * @return <code>true</code> if an index exists; <code>false</code> otherwise
    */
+
   public static boolean indexExists(File directory) {
-    return (new File(directory, IndexFileNames.SEGMENTS)).exists();
+    return SegmentInfos.getCurrentSegmentGeneration(directory.list()) != -1;
   }
 
   /**
@@ -340,7 +338,7 @@ public abstract class IndexReader {
    * @throws IOException if there is a problem with accessing the index
    */
   public static boolean indexExists(Directory directory) throws IOException {
-    return directory.fileExists(IndexFileNames.SEGMENTS);
+    return SegmentInfos.getCurrentSegmentGeneration(directory) != -1;
   }
 
   /** Returns the number of documents in this index. */
@@ -592,17 +590,22 @@ public abstract class IndexReader {
    */
   protected final synchronized void commit() throws IOException{
     if(hasChanges){
+      if (deleter == null) {
+        // In the MultiReader case, we share this deleter
+        // across all SegmentReaders:
+        setDeleter(new IndexFileDeleter(segmentInfos, directory));
+        deleter.deleteFiles();
+      }
       if(directoryOwner){
-        synchronized (directory) {      // in- & inter-process sync
-          new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
-                        IndexWriter.COMMIT_LOCK_TIMEOUT) {
-              public Object doBody() throws IOException {
-                doCommit();
-                segmentInfos.write(directory);
-                return null;
-              }
-            }.run();
-        }
+        deleter.clearPendingFiles();
+        doCommit();
+        String oldInfoFileName = segmentInfos.getCurrentSegmentFileName();
+        segmentInfos.write(directory);
+        // Attempt to delete all files we just obsoleted:
+        deleter.deleteFile(oldInfoFileName);
+        deleter.commitPendingFiles();
+        deleter.deleteFiles();
       }
       if (writeLock != null) {
         writeLock.release();  // release write lock
         writeLock = null;
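The rewritten commit path above replaces the commit lock with a write-once sequence: write the next segments_N file first, then delete the files it obsoleted. Since no file is ever overwritten, a reader can always open whichever segments_N it finds. A toy illustration of that ordering (hypothetical class and field names; IndexFileDeleter's real API is richer than a list remove):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative write-once commit: nothing is overwritten, so readers never
// need a commit lock; they simply open the newest segments_N they can find.
final class LocklessCommitSketch {
    long generation = 1;                       // generation of the live commit point
    final List<String> files = new ArrayList<>(); // stand-in for the directory listing

    String commit() {
        String oldSegmentsFile = "segments_" + Long.toString(generation, Character.MAX_RADIX);
        generation++;                          // file names are never re-used
        String newSegmentsFile = "segments_" + Long.toString(generation, Character.MAX_RADIX);
        files.add(newSegmentsFile);            // 1. write the new commit point
        files.remove(oldSegmentsFile);         // 2. only then delete what it obsoleted
        return newSegmentsFile;
    }
}
```

The crucial property is the ordering: the new commit point is durable before any old file is removed, so there is no window in which a reader sees no commit point at all.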
@@ -614,6 +617,13 @@ public abstract class IndexReader {
     hasChanges = false;
   }
 
+  protected void setDeleter(IndexFileDeleter deleter) {
+    this.deleter = deleter;
+  }
+  protected IndexFileDeleter getDeleter() {
+    return deleter;
+  }
+
   /** Implements commit. */
   protected abstract void doCommit() throws IOException;
 
@@ -658,8 +668,7 @@ public abstract class IndexReader {
    */
   public static boolean isLocked(Directory directory) throws IOException {
     return
-      directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked() ||
-      directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).isLocked();
+      directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked();
   }
 
   /**
@@ -684,7 +693,6 @@ public abstract class IndexReader {
    */
   public static void unlock(Directory directory) throws IOException {
     directory.makeLock(IndexWriter.WRITE_LOCK_NAME).release();
-    directory.makeLock(IndexWriter.COMMIT_LOCK_NAME).release();
   }
 
   /**
@@ -67,16 +67,7 @@ public class IndexWriter {
 
   private long writeLockTimeout = WRITE_LOCK_TIMEOUT;
 
-  /**
-   * Default value for the commit lock timeout (10,000).
-   * @see #setDefaultCommitLockTimeout
-   */
-  public static long COMMIT_LOCK_TIMEOUT = 10000;
-
-  private long commitLockTimeout = COMMIT_LOCK_TIMEOUT;
-
   public static final String WRITE_LOCK_NAME = "write.lock";
-  public static final String COMMIT_LOCK_NAME = "commit.lock";
 
   /**
    * Default value is 10.  Change using {@link #setMergeFactor(int)}.
@@ -111,6 +102,7 @@ public class IndexWriter {
   private SegmentInfos segmentInfos = new SegmentInfos();       // the segments
   private SegmentInfos ramSegmentInfos = new SegmentInfos();    // the segments in ramDirectory
   private final Directory ramDirectory = new RAMDirectory();    // for temp segs
+  private IndexFileDeleter deleter;
 
   private Lock writeLock;
 
@@ -260,19 +252,30 @@ public class IndexWriter {
     this.writeLock = writeLock;                   // save it
 
     try {
-      synchronized (directory) {          // in- & inter-process sync
-        new Lock.With(directory.makeLock(IndexWriter.COMMIT_LOCK_NAME), commitLockTimeout) {
-            public Object doBody() throws IOException {
-              if (create)
-                segmentInfos.write(directory);
-              else
-                segmentInfos.read(directory);
-              return null;
-            }
-          }.run();
-      }
+      if (create) {
+        // Try to read first.  This is to allow create
+        // against an index that's currently open for
+        // searching.  In this case we write the next
+        // segments_N file with no segments:
+        try {
+          segmentInfos.read(directory);
+          segmentInfos.clear();
+        } catch (IOException e) {
+          // Likely this means it's a fresh directory
+        }
+        segmentInfos.write(directory);
+      } else {
+        segmentInfos.read(directory);
+      }
+
+      // Create a deleter to keep track of which files can
+      // be deleted:
+      deleter = new IndexFileDeleter(segmentInfos, directory);
+      deleter.setInfoStream(infoStream);
+      deleter.findDeletableFiles();
+      deleter.deleteFiles();
+
     } catch (IOException e) {
-      // the doBody method failed
       this.writeLock.release();
       this.writeLock = null;
       throw e;
@@ -380,35 +383,6 @@ public class IndexWriter {
     return infoStream;
   }
 
-  /**
-   * Sets the maximum time to wait for a commit lock (in milliseconds) for this instance of IndexWriter.  @see
-   * @see #setDefaultCommitLockTimeout to change the default value for all instances of IndexWriter.
-   */
-  public void setCommitLockTimeout(long commitLockTimeout) {
-    this.commitLockTimeout = commitLockTimeout;
-  }
-
-  /**
-   * @see #setCommitLockTimeout
-   */
-  public long getCommitLockTimeout() {
-    return commitLockTimeout;
-  }
-
-  /**
-   * Sets the default (for any instance of IndexWriter) maximum time to wait for a commit lock (in milliseconds)
-   */
-  public static void setDefaultCommitLockTimeout(long commitLockTimeout) {
-    IndexWriter.COMMIT_LOCK_TIMEOUT = commitLockTimeout;
-  }
-
-  /**
-   * @see #setDefaultCommitLockTimeout
-   */
-  public static long getDefaultCommitLockTimeout() {
-    return IndexWriter.COMMIT_LOCK_TIMEOUT;
-  }
-
   /**
    * Sets the maximum time to wait for a write lock (in milliseconds) for this instance of IndexWriter.  @see
    * @see #setDefaultWriteLockTimeout to change the default value for all instances of IndexWriter.
@@ -517,7 +491,7 @@ public class IndexWriter {
     String segmentName = newRAMSegmentName();
     dw.addDocument(segmentName, doc);
     synchronized (this) {
-      ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory));
+      ramSegmentInfos.addElement(new SegmentInfo(segmentName, 1, ramDirectory, false));
       maybeFlushRamSegments();
     }
   }
@@ -790,36 +764,26 @@ public class IndexWriter {
     int docCount = merger.merge();                // merge 'em
 
     segmentInfos.setSize(0);                      // pop old infos & add new
-    segmentInfos.addElement(new SegmentInfo(mergedName, docCount, directory));
+    SegmentInfo info = new SegmentInfo(mergedName, docCount, directory, false);
+    segmentInfos.addElement(info);
 
     if(sReader != null)
       sReader.close();
 
-    synchronized (directory) {          // in- & inter-process sync
-      new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-          public Object doBody() throws IOException {
-            segmentInfos.write(directory);    // commit changes
-            return null;
-          }
-        }.run();
-    }
-
-    deleteSegments(segmentsToDelete);  // delete now-unused segments
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    segmentInfos.write(directory);                // commit changes
+    deleter.deleteFile(segmentsInfosFileName);    // delete old segments_N file
+    deleter.deleteSegments(segmentsToDelete);     // delete now-unused segments
 
     if (useCompoundFile) {
-      final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-      synchronized (directory) {        // in- & inter-process sync
-        new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-            public Object doBody() throws IOException {
-              // make compound file visible for SegmentReaders
-              directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-              return null;
-            }
-          }.run();
-      }
-
-      // delete now unused files of segment
-      deleteFiles(filesToDelete);
+      Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
+      segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+      info.setUseCompoundFile(true);
+      segmentInfos.write(directory);              // commit again so readers know we've switched this segment to a compound file
+      deleter.deleteFile(segmentsInfosFileName);  // delete old segments_N file
+      deleter.deleteFiles(filesToDelete);         // delete now unused files of segment
     }
   }
 
@@ -937,6 +901,7 @@ public class IndexWriter {
    */
   private final int mergeSegments(SegmentInfos sourceSegments, int minSegment, int end)
     throws IOException {
+
     final String mergedName = newSegmentName();
     if (infoStream != null) infoStream.print("merging segments");
     SegmentMerger merger = new SegmentMerger(this, mergedName);
@@ -960,7 +925,7 @@ public class IndexWriter {
     }
 
     SegmentInfo newSegment = new SegmentInfo(mergedName, mergedDocCount,
-                                             directory);
+                                             directory, false);
     if (sourceSegments == ramSegmentInfos) {
       sourceSegments.removeAllElements();
       segmentInfos.addElement(newSegment);
@@ -973,115 +938,26 @@ public class IndexWriter {
     // close readers before we attempt to delete now-obsolete segments
     merger.closeReaders();
 
-    synchronized (directory) {                 // in- & inter-process sync
-      new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-          public Object doBody() throws IOException {
-            segmentInfos.write(directory);     // commit before deleting
-            return null;
-          }
-        }.run();
-    }
-
-    deleteSegments(segmentsToDelete);    // delete now-unused segments
+    String segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+    segmentInfos.write(directory);               // commit before deleting
+    deleter.deleteFile(segmentsInfosFileName);   // delete old segments_N file
+    deleter.deleteSegments(segmentsToDelete);    // delete now-unused segments
 
     if (useCompoundFile) {
-      final Vector filesToDelete = merger.createCompoundFile(mergedName + ".tmp");
-      synchronized (directory) {               // in- & inter-process sync
-        new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), commitLockTimeout) {
-            public Object doBody() throws IOException {
-              // make compound file visible for SegmentReaders
-              directory.renameFile(mergedName + ".tmp", mergedName + ".cfs");
-              return null;
-            }
-          }.run();
-      }
-
-      // delete now unused files of segment
-      deleteFiles(filesToDelete);
+      Vector filesToDelete = merger.createCompoundFile(mergedName + ".cfs");
+      segmentsInfosFileName = segmentInfos.getCurrentSegmentFileName();
+      newSegment.setUseCompoundFile(true);
+      segmentInfos.write(directory);             // commit again so readers know we've switched this segment to a compound file
+
+      deleter.deleteFile(segmentsInfosFileName); // delete old segments_N file
+      deleter.deleteFiles(filesToDelete);        // delete now-unused segments
     }
 
     return mergedDocCount;
   }
 
-  /*
-   * Some operating systems (e.g. Windows) don't permit a file to be deleted
-   * while it is opened for read (e.g. by another process or thread).  So we
-   * assume that when a delete fails it is because the file is open in another
-   * process, and queue the file for subsequent deletion.
-   */
-
-  private final void deleteSegments(Vector segments) throws IOException {
-    Vector deletable = new Vector();
-
-    deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-
-    for (int i = 0; i < segments.size(); i++) {
-      SegmentReader reader = (SegmentReader)segments.elementAt(i);
-      if (reader.directory() == this.directory)
-        deleteFiles(reader.files(), deletable);    // try to delete our files
-      else
-        deleteFiles(reader.files(), reader.directory()); // delete other files
-    }
-
-    writeDeleteableFiles(deletable);               // note files we can't delete
-  }
-
-  private final void deleteFiles(Vector files) throws IOException {
-    Vector deletable = new Vector();
-    deleteFiles(readDeleteableFiles(), deletable); // try to delete deleteable
-    deleteFiles(files, deletable);                 // try to delete our files
-    writeDeleteableFiles(deletable);               // note files we can't delete
-  }
-
-  private final void deleteFiles(Vector files, Directory directory)
-      throws IOException {
-    for (int i = 0; i < files.size(); i++)
-      directory.deleteFile((String)files.elementAt(i));
-  }
-
-  private final void deleteFiles(Vector files, Vector deletable)
-      throws IOException {
-    for (int i = 0; i < files.size(); i++) {
-      String file = (String)files.elementAt(i);
-      try {
-        directory.deleteFile(file);                // try to delete each file
-      } catch (IOException e) {                    // if delete fails
-        if (directory.fileExists(file)) {
-          if (infoStream != null)
-            infoStream.println(e.toString() + "; Will re-try later.");
-          deletable.addElement(file);              // add to deletable
-        }
-      }
-    }
-  }
-
-  private final Vector readDeleteableFiles() throws IOException {
-    Vector result = new Vector();
-    if (!directory.fileExists(IndexFileNames.DELETABLE))
-      return result;
-
-    IndexInput input = directory.openInput(IndexFileNames.DELETABLE);
-    try {
-      for (int i = input.readInt(); i > 0; i--)    // read file names
-        result.addElement(input.readString());
-    } finally {
-      input.close();
-    }
-    return result;
-  }
-
-  private final void writeDeleteableFiles(Vector files) throws IOException {
-    IndexOutput output = directory.createOutput("deleteable.new");
-    try {
-      output.writeInt(files.size());
-      for (int i = 0; i < files.size(); i++)
-        output.writeString((String)files.elementAt(i));
-    } finally {
-      output.close();
-    }
-    directory.renameFile("deleteable.new", IndexFileNames.DELETABLE);
-  }
-
   private final boolean checkNonDecreasingLevels(int start) {
     int lowerBound = -1;
     int upperBound = minMergeDocs;
@@ -218,6 +218,13 @@ public class MultiReader extends IndexReader {
     return new MultiTermPositions(subReaders, starts);
   }
 
+  protected void setDeleter(IndexFileDeleter deleter) {
+    // Share deleter to our SegmentReaders:
+    this.deleter = deleter;
+    for (int i = 0; i < subReaders.length; i++)
+      subReaders[i].setDeleter(deleter);
+  }
+
   protected void doCommit() throws IOException {
     for (int i = 0; i < subReaders.length; i++)
       subReaders[i].commit();
@ -18,15 +18,302 @@ package org.apache.lucene.index;
|
||||||
*/
|
*/
|
||||||
|
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
|
import org.apache.lucene.store.IndexOutput;
|
||||||
|
import org.apache.lucene.store.IndexInput;
|
||||||
|
import java.io.IOException;
|
||||||
|
|
||||||
final class SegmentInfo {
|
final class SegmentInfo {
|
||||||
public String name; // unique name in dir
|
public String name; // unique name in dir
|
||||||
public int docCount; // number of docs in seg
|
public int docCount; // number of docs in seg
|
||||||
public Directory dir; // where segment resides
|
public Directory dir; // where segment resides
|
||||||
|
|
||||||
|
private boolean preLockless; // true if this is a segments file written before
|
||||||
|
// lock-less commits (XXX)
|
||||||
|
|
||||||
|
private long delGen; // current generation of del file; -1 if there
|
||||||
|
// are no deletes; 0 if it's a pre-XXX segment
|
||||||
|
// (and we must check filesystem); 1 or higher if
|
||||||
|
// there are deletes at generation N
|
||||||
|
|
||||||
|
private long[] normGen; // current generations of each field's norm file.
|
||||||
|
// If this array is null, we must check filesystem
|
||||||
|
// when preLockLess is true. Else,
|
||||||
|
// there are no separate norms
|
||||||
|
|
||||||
|
private byte isCompoundFile; // -1 if it is not; 1 if it is; 0 if it's
|
||||||
|
// pre-XXX (ie, must check file system to see
|
||||||
|
// if <name>.cfs exists)
|
||||||
|
|
||||||
public SegmentInfo(String name, int docCount, Directory dir) {
|
public SegmentInfo(String name, int docCount, Directory dir) {
|
||||||
this.name = name;
|
this.name = name;
|
||||||
this.docCount = docCount;
|
this.docCount = docCount;
|
||||||
this.dir = dir;
|
this.dir = dir;
|
||||||
|
delGen = -1;
|
||||||
|
isCompoundFile = 0;
|
||||||
|
preLockless = true;
|
||||||
|
}
|
||||||
|
public SegmentInfo(String name, int docCount, Directory dir, boolean isCompoundFile) {
|
||||||
|
this(name, docCount, dir);
|
||||||
|
if (isCompoundFile) {
|
||||||
|
this.isCompoundFile = 1;
|
||||||
|
} else {
|
||||||
|
this.isCompoundFile = -1;
|
||||||
|
}
|
||||||
|
preLockless = false;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Construct a new SegmentInfo instance by reading a
|
||||||
|
* previously saved SegmentInfo from input.
|
||||||
|
*
|
||||||
|
* @param dir directory to load from
|
||||||
|
* @param format format of the segments info file
|
||||||
|
* @param input input handle to read segment info from
|
||||||
|
*/
|
||||||
|
public SegmentInfo(Directory dir, int format, IndexInput input) throws IOException {
|
||||||
|
this.dir = dir;
|
||||||
|
name = input.readString();
|
||||||
|
docCount = input.readInt();
|
||||||
|
if (format <= SegmentInfos.FORMAT_LOCKLESS) {
|
||||||
|
delGen = input.readLong();
|
||||||
|
int numNormGen = input.readInt();
|
||||||
|
if (numNormGen == -1) {
|
||||||
|
normGen = null;
|
||||||
|
} else {
|
||||||
|
normGen = new long[numNormGen];
|
||||||
|
for(int j=0;j<numNormGen;j++) {
|
||||||
|
normGen[j] = input.readLong();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
isCompoundFile = input.readByte();
|
||||||
|
preLockless = isCompoundFile == 0;
|
||||||
|
} else {
|
||||||
|
delGen = 0;
|
||||||
|
normGen = null;
|
||||||
|
isCompoundFile = 0;
|
||||||
|
preLockless = true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void setNumField(int numField) {
|
||||||
|
if (normGen == null) {
|
||||||
|
// normGen is null if we loaded a pre-XXX segment
|
||||||
|
// file, or, if this segments file hasn't had any
|
||||||
|
// norms set against it yet:
|
||||||
|
normGen = new long[numField];
|
||||||
|
|
||||||
|
if (!preLockless) {
|
||||||
|
// This is a FORMAT_LOCKLESS segment, which means
|
||||||
|
// there are no norms:
|
||||||
|
for(int i=0;i<numField;i++) {
|
||||||
|
normGen[i] = -1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
boolean hasDeletions()
|
||||||
|
throws IOException {
|
||||||
|
// Cases:
|
||||||
|
//
|
||||||
|
// delGen == -1: this means this segment was written
|
||||||
|
// by the LOCKLESS code and for certain does not have
|
||||||
|
// deletions yet
|
||||||
|
//
|
||||||
|
// delGen == 0: this means this segment was written by
|
||||||
|
// pre-LOCKLESS code which means we must check
|
||||||
|
// directory to see if .del file exists
|
||||||
|
//
|
||||||
|
// delGen > 0: this means this segment was written by
|
||||||
|
// the LOCKLESS code and for certain has
|
||||||
|
// deletions
|
||||||
|
//
|
||||||
|
if (delGen == -1) {
|
||||||
|
return false;
|
||||||
|
} else if (delGen > 0) {
|
||||||
|
return true;
|
||||||
|
} else {
|
||||||
|
return dir.fileExists(getDelFileName());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void advanceDelGen() {
|
||||||
|
// delGen 0 is reserved for pre-LOCKLESS format
|
||||||
|
if (delGen == -1) {
|
||||||
|
delGen = 1;
|
||||||
|
} else {
|
||||||
|
delGen++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void clearDelGen() {
|
||||||
|
delGen = -1;
|
||||||
|
}
|
||||||
|
|
||||||
|
String getDelFileName() {
|
||||||
|
if (delGen == -1) {
|
||||||
|
// In this case we know there is no deletion filename
|
||||||
|
// against this segment
|
||||||
|
return null;
|
||||||
|
} else {
|
||||||
|
// If delGen is 0, it's the pre-lockless-commit file format
|
||||||
|
return IndexFileNames.fileNameFromGeneration(name, ".del", delGen);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns true if this field for this segment has saved a separate norms file (_<segment>_N.sX).
|
||||||
|
*
|
||||||
|
* @param fieldNumber the field index to check
|
||||||
|
*/
|
||||||
|
boolean hasSeparateNorms(int fieldNumber)
|
||||||
|
throws IOException {
|
||||||
|
if ((normGen == null && preLockless) || (normGen != null && normGen[fieldNumber] == 0)) {
|
||||||
|
// Must fallback to directory file exists check:
|
||||||
|
String fileName = name + ".s" + fieldNumber;
|
||||||
|
return dir.fileExists(fileName);
|
||||||
|
} else if (normGen == null || normGen[fieldNumber] == -1) {
|
||||||
|
return false;
|
||||||
|
} else {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns true if any fields in this segment have separate norms.
|
||||||
|
*/
|
||||||
|
boolean hasSeparateNorms()
|
||||||
|
throws IOException {
|
||||||
|
if (normGen == null) {
|
||||||
|
if (!preLockless) {
|
||||||
|
// This means we were created w/ LOCKLESS code and no
|
||||||
|
// norms are written yet:
|
||||||
|
return false;
|
||||||
|
} else {
|
||||||
|
// This means this segment was saved with pre-LOCKLESS
|
||||||
|
// code. So we must fallback to the original
|
||||||
|
// directory list check:
|
||||||
|
String[] result = dir.list();
|
||||||
|
String pattern;
|
||||||
|
pattern = name + ".s";
|
||||||
|
int patternLength = pattern.length();
|
||||||
|
for(int i = 0; i < result.length; i++){
|
||||||
|
if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// This means this segment was saved with LOCKLESS
|
||||||
|
// code so we first check whether any normGen's are >
|
||||||
|
// 0 (meaning they definitely have separate norms):
|
||||||
|
for(int i=0;i<normGen.length;i++) {
|
||||||
|
if (normGen[i] > 0) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Next we look for any == 0. These cases were
|
||||||
|
// pre-LOCKLESS and must be checked in directory:
|
||||||
|
for(int i=0;i<normGen.length;i++) {
|
||||||
|
if (normGen[i] == 0) {
|
||||||
|
if (dir.fileExists(getNormFileName(i))) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Increment the generation count for the norms file for
|
||||||
|
* this field.
|
||||||
|
*
|
||||||
|
* @param fieldIndex field whose norm file will be rewritten
|
||||||
|
*/
|
||||||
|
void advanceNormGen(int fieldIndex) {
|
||||||
|
if (normGen[fieldIndex] == -1) {
|
||||||
|
normGen[fieldIndex] = 1;
|
||||||
|
} else {
|
||||||
|
normGen[fieldIndex]++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
+   * Get the file name for the norms file for this field.
+   *
+   * @param number field index
+   */
+  String getNormFileName(int number) throws IOException {
+    String prefix;
+
+    long gen;
+    if (normGen == null) {
+      gen = 0;
+    } else {
+      gen = normGen[number];
+    }
+
+    if (hasSeparateNorms(number)) {
+      prefix = ".s";
+      return IndexFileNames.fileNameFromGeneration(name, prefix + number, gen);
+    } else {
+      prefix = ".f";
+      return IndexFileNames.fileNameFromGeneration(name, prefix + number, 0);
+    }
+  }
+
+  /**
+   * Mark whether this segment is stored as a compound file.
+   *
+   * @param isCompoundFile true if this is a compound file;
+   * else, false
+   */
+  void setUseCompoundFile(boolean isCompoundFile) {
+    if (isCompoundFile) {
+      this.isCompoundFile = 1;
+    } else {
+      this.isCompoundFile = -1;
+    }
+  }
+
+  /**
+   * Returns true if this segment is stored as a compound
+   * file; else, false.
+   *
+   * @param directory directory to check.  This parameter is
+   * only used when the segment was written before version
+   * XXX (at which point compound file or not became stored
+   * in the segments info file).
+   */
+  boolean getUseCompoundFile() throws IOException {
+    if (isCompoundFile == -1) {
+      return false;
+    } else if (isCompoundFile == 1) {
+      return true;
+    } else {
+      return dir.fileExists(name + ".cfs");
+    }
+  }
+
+  /**
+   * Save this segment's info.
+   */
+  void write(IndexOutput output)
+    throws IOException {
+    output.writeString(name);
+    output.writeInt(docCount);
+    output.writeLong(delGen);
+    if (normGen == null) {
+      output.writeInt(-1);
+    } else {
+      output.writeInt(normGen.length);
+      for(int j=0;j<normGen.length;j++) {
+        output.writeLong(normGen[j]);
+      }
+    }
+    output.writeByte(isCompoundFile);
+  }
   }
 }
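The new `getNormFileName` keys every norms file off a per-field generation. As a rough standalone sketch of the naming scheme it relies on (the helper below is illustrative, not the real `IndexFileNames`; the assumed semantics are: generation -1 means no file, 0 means the plain pre-2.1 name, and higher generations append a base-36 `_N` infix):

```java
// Standalone sketch of generation-based file naming used by
// lockless commits (hypothetical helper, not Lucene's own class).
public class GenNames {

  // Assumed mapping: -1 -> no file; 0 -> original name;
  // N > 0 -> base-36 "_N" inserted before the extension.
  static String fileNameFromGeneration(String base, String extension, long gen) {
    if (gen == -1) {
      return null;                      // file does not exist
    } else if (gen == 0) {
      return base + extension;          // pre-lockless name
    } else {
      return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
    }
  }

  public static void main(String[] args) {
    // Separate norms for field 3 of segment "_7", before and
    // after the norms were rewritten twice:
    System.out.println(fileNameFromGeneration("_7", ".s3", 0)); // _7.s3
    System.out.println(fileNameFromGeneration("_7", ".s3", 2)); // _7_2.s3
  }
}
```

Because a rewrite always advances the generation, a new norms file never reuses an old file name, which is what lets readers open files without a commit lock.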
@@ -19,36 +19,151 @@ package org.apache.lucene.index;
 
 import java.util.Vector;
 import java.io.IOException;
+import java.io.PrintStream;
+import java.io.File;
+import java.io.FileNotFoundException;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.IndexInput;
 import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Constants;
 
-final class SegmentInfos extends Vector {
+public final class SegmentInfos extends Vector {
 
   /** The file format version, a negative number. */
   /* Works since counter, the old 1st entry, is always >= 0 */
   public static final int FORMAT = -1;
 
+  /** This is the current file format written.  It differs
+   * slightly from the previous format in that file names
+   * are never re-used (write once).  Instead, each file is
+   * written to the next generation.  For example,
+   * segments_1, segments_2, etc.  This allows us to not use
+   * a commit lock.  See <a
+   * href="http://lucene.apache.org/java/docs/fileformats.html">file
+   * formats</a> for details.
+   */
+  public static final int FORMAT_LOCKLESS = -2;
+
   public int counter = 0; // used to name new segments
   /**
   * counts how often the index has been changed by adding or deleting docs.
   * starting with the current time in milliseconds forces to create unique version numbers.
   */
   private long version = System.currentTimeMillis();
+  private long generation = 0; // generation of the "segments_N" file we read
+
+  /**
+   * If non-null, information about loading segments_N files
+   * will be printed here.  @see #setInfoStream.
+   */
+  private static PrintStream infoStream;
 
   public final SegmentInfo info(int i) {
     return (SegmentInfo) elementAt(i);
   }
 
-  public final void read(Directory directory) throws IOException {
+  /**
+   * Get the generation (N) of the current segments_N file
+   * from a list of files.
+   *
+   * @param files -- array of file names to check
+   */
+  public static long getCurrentSegmentGeneration(String[] files) {
+    if (files == null) {
+      return -1;
+    }
+    long max = -1;
+    int prefixLen = IndexFileNames.SEGMENTS.length()+1;
+    for (int i = 0; i < files.length; i++) {
+      String file = files[i];
+      if (file.startsWith(IndexFileNames.SEGMENTS) && !file.equals(IndexFileNames.SEGMENTS_GEN)) {
+        if (file.equals(IndexFileNames.SEGMENTS)) {
+          // Pre lock-less commits:
+          if (max == -1) {
+            max = 0;
+          }
+        } else {
+          long v = Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
+          if (v > max) {
+            max = v;
+          }
+        }
+      }
+    }
+    return max;
+  }
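The generation scan above can be exercised in isolation. This sketch reimplements the same loop with local constants standing in for `IndexFileNames.SEGMENTS` and `IndexFileNames.SEGMENTS_GEN` (which are not defined in this hunk):

```java
// Standalone sketch of getCurrentSegmentGeneration: given a raw
// directory listing, find the highest segments_N generation.
public class GenScan {
  static final String SEGMENTS = "segments";        // assumed constant values
  static final String SEGMENTS_GEN = "segments.gen";

  static long currentGeneration(String[] files) {
    if (files == null) return -1;
    long max = -1;
    int prefixLen = SEGMENTS.length() + 1;          // skip "segments_"
    for (String file : files) {
      if (file.startsWith(SEGMENTS) && !file.equals(SEGMENTS_GEN)) {
        if (file.equals(SEGMENTS)) {
          // A plain "segments" file is a pre-lockless index: gen 0
          if (max == -1) max = 0;
        } else {
          // The suffix after "segments_" is the generation in base 36
          long v = Long.parseLong(file.substring(prefixLen), Character.MAX_RADIX);
          if (v > max) max = v;
        }
      }
    }
    return max;
  }

  public static void main(String[] args) {
    String[] files = {"_0.cfs", "segments_9", "segments_a", "segments.gen"};
    System.out.println(currentGeneration(files)); // 10 ("a" in base 36)
  }
}
```

Note that `segments.gen` is explicitly excluded: it is only a fallback hint, never the source of truth for the directory listing.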
+  /**
+   * Get the generation (N) of the current segments_N file
+   * in the directory.
+   *
+   * @param directory -- directory to search for the latest segments_N file
+   */
+  public static long getCurrentSegmentGeneration(Directory directory) throws IOException {
+    String[] files = directory.list();
+    if (files == null)
+      throw new IOException("Cannot read directory " + directory);
+    return getCurrentSegmentGeneration(files);
+  }
+
+  /**
+   * Get the filename of the current segments_N file
+   * from a list of files.
+   *
+   * @param files -- array of file names to check
+   */
+
+  public static String getCurrentSegmentFileName(String[] files) throws IOException {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 getCurrentSegmentGeneration(files));
+  }
+
+  /**
+   * Get the filename of the current segments_N file
+   * in the directory.
+   *
+   * @param directory -- directory to search for the latest segments_N file
+   */
+  public static String getCurrentSegmentFileName(Directory directory) throws IOException {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 getCurrentSegmentGeneration(directory));
+  }
+
+  /**
+   * Get the segment_N filename in use by this segment infos.
+   */
+  public String getCurrentSegmentFileName() {
+    return IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                 "",
+                                                 generation);
+  }
+
+  /**
+   * Read a particular segmentFileName.  Note that this may
+   * throw an IOException if a commit is in process.
+   *
+   * @param directory -- directory containing the segments file
+   * @param segmentFileName -- segment file to load
+   */
+  public final void read(Directory directory, String segmentFileName) throws IOException {
+    boolean success = false;
+
+    IndexInput input = directory.openInput(segmentFileName);
+
+    if (segmentFileName.equals(IndexFileNames.SEGMENTS)) {
+      generation = 0;
+    } else {
+      generation = Long.parseLong(segmentFileName.substring(1+IndexFileNames.SEGMENTS.length()),
+                                  Character.MAX_RADIX);
+    }
+
-    IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
 
     try {
       int format = input.readInt();
       if(format < 0){     // file contains explicit format info
         // check that it is a format we can understand
-        if (format < FORMAT)
+        if (format < FORMAT_LOCKLESS)
           throw new IOException("Unknown format version: " + format);
         version = input.readLong(); // read version
         counter = input.readInt(); // read counter
@@ -58,9 +173,7 @@ final class SegmentInfos extends Vector {
       }
 
       for (int i = input.readInt(); i > 0; i--) { // read segmentInfos
-        SegmentInfo si =
-          new SegmentInfo(input.readString(), input.readInt(), directory);
-        addElement(si);
+        addElement(new SegmentInfo(directory, format, input));
       }
 
       if(format >= 0){    // in old format the version number may be at the end of the file
@@ -69,31 +182,71 @@ final class SegmentInfos extends Vector {
       else
         version = input.readLong(); // read version
       }
+      success = true;
     }
     finally {
       input.close();
+      if (!success) {
+        // Clear any segment infos we had loaded so we
+        // have a clean slate on retry:
+        clear();
+      }
     }
   }
 
+  /**
+   * This version of read uses the retry logic (for lock-less
+   * commits) to find the right segments file to load.
+   */
+  public final void read(Directory directory) throws IOException {
+
+    generation = -1;
+
+    new FindSegmentsFile(directory) {
+
+      public Object doBody(String segmentFileName) throws IOException {
+        read(directory, segmentFileName);
+        return null;
+      }
+    }.run();
+  }
 
   public final void write(Directory directory) throws IOException {
-    IndexOutput output = directory.createOutput("segments.new");
+
+    // Always advance the generation on write:
+    if (generation == -1) {
+      generation = 1;
+    } else {
+      generation++;
+    }
+
+    String segmentFileName = getCurrentSegmentFileName();
+    IndexOutput output = directory.createOutput(segmentFileName);
 
     try {
-      output.writeInt(FORMAT);          // write FORMAT
+      output.writeInt(FORMAT_LOCKLESS); // write FORMAT
-      output.writeLong(++version);      // every write changes the index
+      output.writeLong(++version);      // every write changes
+                                        // the index
       output.writeInt(counter); // write counter
       output.writeInt(size()); // write infos
       for (int i = 0; i < size(); i++) {
        SegmentInfo si = info(i);
-       output.writeString(si.name);
-       output.writeInt(si.docCount);
+       si.write(output);
      }
    }
    finally {
      output.close();
    }
 
-    // install new segment info
-    directory.renameFile("segments.new", IndexFileNames.SEGMENTS);
+    try {
+      output = directory.createOutput(IndexFileNames.SEGMENTS_GEN);
+      output.writeInt(FORMAT_LOCKLESS);
+      output.writeLong(generation);
+      output.writeLong(generation);
+      output.close();
+    } catch (IOException e) {
+      // It's OK if we fail to write this file since it's
+      // used only as one of the retry fallbacks.
+    }
  }
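`write()` ends by recording the generation twice in `segments.gen`. A reader consulting this fallback file can thereby detect a torn or partial write: if the two copies disagree, the file is ignored. A standalone sketch of that protocol, using `DataOutputStream`/`DataInputStream` over byte arrays in place of Lucene's `IndexOutput`/`IndexInput` (the on-disk encoding here is illustrative, not Lucene's exact one):

```java
import java.io.*;

// Sketch of the segments.gen double-write: the generation is
// written twice so a reader can reject a partially-written file.
public class SegmentsGen {
  static final int FORMAT_LOCKLESS = -2;   // as in the patch

  static byte[] write(long generation) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeInt(FORMAT_LOCKLESS);
    out.writeLong(generation);   // copy 1
    out.writeLong(generation);   // copy 2
    out.close();
    return bos.toByteArray();
  }

  // Returns the generation, or -1 if the file is unusable
  // (wrong format, or the two copies disagree).
  static long read(byte[] contents) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(contents));
    if (in.readInt() != FORMAT_LOCKLESS) return -1;
    long gen0 = in.readLong();
    long gen1 = in.readLong();
    return (gen0 == gen1) ? gen0 : -1;
  }

  public static void main(String[] args) throws IOException {
    byte[] ok = write(42);
    System.out.println(read(ok));   // 42

    // Simulate a torn write by corrupting the second copy:
    ok[ok.length - 1] ^= 1;
    System.out.println(read(ok));   // -1
  }
}
```

This is also why failing to write `segments.gen` is tolerated above: it is only a hint, and a bad or missing copy simply pushes readers back to the other discovery methods.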
 
   /**
@@ -109,13 +262,17 @@ final class SegmentInfos extends Vector {
   public static long readCurrentVersion(Directory directory)
     throws IOException {
 
-    IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
+    return ((Long) new FindSegmentsFile(directory) {
+      public Object doBody(String segmentFileName) throws IOException {
+
+        IndexInput input = directory.openInput(segmentFileName);
+
         int format = 0;
         long version = 0;
         try {
           format = input.readInt();
           if(format < 0){
-            if (format < FORMAT)
+            if (format < FORMAT_LOCKLESS)
               throw new IOException("Unknown format version: " + format);
             version = input.readLong(); // read version
           }
@@ -125,13 +282,301 @@ final class SegmentInfos extends Vector {
         }
 
         if(format < 0)
-          return version;
+          return new Long(version);
 
         // We cannot be sure about the format of the file.
         // Therefore we have to read the whole file and cannot simply seek to the version entry.
 
         SegmentInfos sis = new SegmentInfos();
-        sis.read(directory);
-        return sis.getVersion();
+        sis.read(directory, segmentFileName);
+        return new Long(sis.getVersion());
+      }
+    }.run()).longValue();
   }
 
+  /** If non-null, information about retries when loading
+   * the segments file will be printed to this.
+   */
+  public static void setInfoStream(PrintStream infoStream) {
+    SegmentInfos.infoStream = infoStream;
+  }
+
+  /* Advanced configuration of retry logic in loading
+     segments_N file */
+  private static int defaultGenFileRetryCount = 10;
+  private static int defaultGenFileRetryPauseMsec = 50;
+  private static int defaultGenLookaheadCount = 10;
+
+  /**
+   * Advanced: set how many times to try loading the
+   * segments.gen file contents to determine current segment
+   * generation.  This file is only referenced when the
+   * primary method (listing the directory) fails.
+   */
+  public static void setDefaultGenFileRetryCount(int count) {
+    defaultGenFileRetryCount = count;
+  }
+
+  /**
+   * @see #setDefaultGenFileRetryCount
+   */
+  public static int getDefaultGenFileRetryCount() {
+    return defaultGenFileRetryCount;
+  }
+
+  /**
+   * Advanced: set how many milliseconds to pause in between
+   * attempts to load the segments.gen file.
+   */
+  public static void setDefaultGenFileRetryPauseMsec(int msec) {
+    defaultGenFileRetryPauseMsec = msec;
+  }
+
+  /**
+   * @see #setDefaultGenFileRetryPauseMsec
+   */
+  public static int getDefaultGenFileRetryPauseMsec() {
+    return defaultGenFileRetryPauseMsec;
+  }
+
+  /**
+   * Advanced: set how many times to try incrementing the
+   * gen when loading the segments file.  This only runs if
+   * the primary (listing directory) and secondary (opening
+   * segments.gen file) methods fail to find the segments
+   * file.
+   */
+  public static void setDefaultGenLookaheadCount(int count) {
+    defaultGenLookaheadCount = count;
+  }
+  /**
+   * @see #setDefaultGenLookaheadCount
+   */
+  public static int getDefaultGenLookahedCount() {
+    return defaultGenLookaheadCount;
+  }
+
+  /**
+   * @see #setInfoStream
+   */
+  public static PrintStream getInfoStream() {
+    return infoStream;
+  }
+
+  private static void message(String message) {
+    if (infoStream != null) {
+      infoStream.println(Thread.currentThread().getName() + ": " + message);
     }
   }
 
+  /**
+   * Utility class for executing code that needs to do
+   * something with the current segments file.  This is
+   * necessary with lock-less commits because from the time
+   * you locate the current segments file name, until you
+   * actually open it, read its contents, or check modified
+   * time, etc., it could have been deleted due to a writer
+   * commit finishing.
+   */
+  public abstract static class FindSegmentsFile {
+
+    File fileDirectory;
+    Directory directory;
+
+    public FindSegmentsFile(File directory) {
+      this.fileDirectory = directory;
+    }
+
+    public FindSegmentsFile(Directory directory) {
+      this.directory = directory;
+    }
+
+    public Object run() throws IOException {
+      String segmentFileName = null;
+      long lastGen = -1;
+      long gen = 0;
+      int genLookaheadCount = 0;
+      IOException exc = null;
+      boolean retry = false;
+
+      int method = 0;
+
+      // Loop until we succeed in calling doBody() without
+      // hitting an IOException.  An IOException most likely
+      // means a commit was in process and has finished, in
+      // the time it took us to load the now-old infos files
+      // (and segments files).  It's also possible it's a
+      // true error (corrupt index).  To distinguish these,
+      // on each retry we must see "forward progress" on
+      // which generation we are trying to load.  If we
+      // don't, then the original error is real and we throw
+      // it.
+
+      // We have three methods for determining the current
+      // generation.  We try each in sequence.
+
+      while(true) {
+
+        // Method 1: list the directory and use the highest
+        // segments_N file.  This method works well as long
+        // as there is no stale caching on the directory
+        // contents:
+        String[] files = null;
+
+        if (0 == method) {
+          if (directory != null) {
+            files = directory.list();
+          } else {
+            files = fileDirectory.list();
+          }
+
+          gen = getCurrentSegmentGeneration(files);
+
+          if (gen == -1) {
+            String s = "";
+            for(int i=0;i<files.length;i++) {
+              s += " " + files[i];
+            }
+            throw new FileNotFoundException("no segments* file found: files:" + s);
+          }
+        }
+
+        // Method 2 (fallback if Method 1 isn't reliable):
+        // if the directory listing seems to be stale, then
+        // try loading the "segments.gen" file.
+        if (1 == method || (0 == method && lastGen == gen && retry)) {
+
+          method = 1;
+
+          for(int i=0;i<defaultGenFileRetryCount;i++) {
+            IndexInput genInput = null;
+            try {
+              genInput = directory.openInput(IndexFileNames.SEGMENTS_GEN);
+            } catch (IOException e) {
+              message("segments.gen open: IOException " + e);
+            }
+            if (genInput != null) {
+
+              try {
+                int version = genInput.readInt();
+                if (version == FORMAT_LOCKLESS) {
+                  long gen0 = genInput.readLong();
+                  long gen1 = genInput.readLong();
+                  message("fallback check: " + gen0 + "; " + gen1);
+                  if (gen0 == gen1) {
+                    // The file is consistent.
+                    if (gen0 > gen) {
+                      message("fallback to '" + IndexFileNames.SEGMENTS_GEN + "' check: now try generation " + gen0 + " > " + gen);
+                      gen = gen0;
+                    }
+                    break;
+                  }
+                }
+              } catch (IOException err2) {
+                // will retry
+              } finally {
+                genInput.close();
+              }
+            }
+            try {
+              Thread.sleep(defaultGenFileRetryPauseMsec);
+            } catch (InterruptedException e) {
+              // will retry
+            }
+          }
+        }
+
+        // Method 3 (fallback if Methods 1 & 2 are not
+        // reliable): since both directory cache and file
+        // contents cache seem to be stale, just advance the
+        // generation.
+        if (2 == method || (1 == method && lastGen == gen && retry)) {
+
+          method = 2;
+
+          if (genLookaheadCount < defaultGenLookaheadCount) {
+            gen++;
+            genLookaheadCount++;
+            message("look ahead increment gen to " + gen);
+          }
+        }
+
+        if (lastGen == gen) {
+
+          // This means we're about to try the same
+          // segments_N last tried.  This is allowed,
+          // exactly once, because writer could have been in
+          // the process of writing segments_N last time.
+
+          if (retry) {
+            // OK, we've tried the same segments_N file
+            // twice in a row, so this must be a real
+            // error.  We throw the original exception we
+            // got.
+            throw exc;
+          } else {
+            retry = true;
+          }
+
+        } else {
+          // Segment file has advanced since our last loop, so
+          // reset retry:
+          retry = false;
+        }
+
+        lastGen = gen;
+
+        segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
+                                                                "",
+                                                                gen);
+
+        try {
+          Object v = doBody(segmentFileName);
+          if (exc != null) {
+            message("success on " + segmentFileName);
+          }
+          return v;
+        } catch (IOException err) {
+
+          // Save the original root cause:
+          if (exc == null) {
+            exc = err;
+          }
+
+          message("primary Exception on '" + segmentFileName + "': " + err + "'; will retry: retry=" + retry + "; gen = " + gen);
+
+          if (!retry && gen > 1) {
+
+            // This is our first time trying this segments
+            // file (because retry is false), and, there is
+            // possibly a segments_(N-1) (because gen > 1).
+            // So, check if the segments_(N-1) exists and
+            // try it if so:
+            String prevSegmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                                                "",
+                                                                               gen-1);
+
+            if (directory.fileExists(prevSegmentFileName)) {
+              message("fallback to prior segment file '" + prevSegmentFileName + "'");
+              try {
+                Object v = doBody(prevSegmentFileName);
+                if (exc != null) {
+                  message("success on fallback " + prevSegmentFileName);
+                }
+                return v;
+              } catch (IOException err2) {
+                message("secondary Exception on '" + prevSegmentFileName + "': " + err2 + "'; will retry");
+              }
+            }
+          }
+        }
+      }
+    }
+
+    /**
+     * Subclass must implement this.  The assumption is an
+     * IOException will be thrown if something goes wrong
+     * during the processing that could have been caused by
+     * a writer committing.
+     */
+    protected abstract Object doBody(String segmentFileName) throws IOException;
+  }
 }
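`FindSegmentsFile` is a template method: `run()` owns the candidate-discovery and retry loop, while subclasses supply only `doBody()`. A deliberately simplified sketch of that skeleton (hypothetical names; unlike the real class, it has no lookahead and surfaces the original failure as soon as the candidate name stops advancing, whereas Lucene allows exactly one repeat attempt on the same generation):

```java
import java.io.IOException;

// Simplified sketch of the FindSegmentsFile retry-template idea:
// keep retrying doBody() against fresh candidates, and only
// rethrow the first failure once no forward progress is made.
public abstract class RetryTemplate {

  // Supplies the current candidate (e.g. the latest segments_N name).
  protected abstract String currentCandidate();

  // The work to run; throws IOException if the candidate vanished
  // out from under us (e.g. a concurrent commit removed it).
  protected abstract Object doBody(String name) throws IOException;

  public Object run() throws IOException {
    String last = null;
    IOException first = null;
    while (true) {
      String name = currentCandidate();
      if (name.equals(last) && first != null) {
        // No forward progress since the last failure:
        // the error is real, so rethrow the root cause.
        throw first;
      }
      last = name;
      try {
        return doBody(name);
      } catch (IOException e) {
        if (first == null) first = e;   // keep the original cause
      }
    }
  }
}
```

The key property, shared with the real class, is that a transient failure (a commit finishing between "find the name" and "open the file") is distinguished from corruption purely by whether the candidate generation keeps moving forward.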
|
||||||
|
|
|
@ -33,6 +33,7 @@ import java.util.*;
|
||||||
*/
|
*/
|
||||||
class SegmentReader extends IndexReader {
|
class SegmentReader extends IndexReader {
|
||||||
private String segment;
|
private String segment;
|
||||||
|
private SegmentInfo si;
|
||||||
|
|
||||||
FieldInfos fieldInfos;
|
FieldInfos fieldInfos;
|
||||||
private FieldsReader fieldsReader;
|
private FieldsReader fieldsReader;
|
||||||
|
@ -64,22 +65,24 @@ class SegmentReader extends IndexReader {
|
||||||
private boolean dirty;
|
private boolean dirty;
|
||||||
private int number;
|
private int number;
|
||||||
|
|
||||||
private void reWrite() throws IOException {
|
private void reWrite(SegmentInfo si) throws IOException {
|
||||||
// NOTE: norms are re-written in regular directory, not cfs
|
// NOTE: norms are re-written in regular directory, not cfs
|
||||||
IndexOutput out = directory().createOutput(segment + ".tmp");
|
|
||||||
|
String oldFileName = si.getNormFileName(this.number);
|
||||||
|
if (oldFileName != null) {
|
||||||
|
// Mark this file for deletion. Note that we don't
|
||||||
|
// actually try to delete it until the new segments files is
|
||||||
|
// successfully written:
|
||||||
|
deleter.addPendingFile(oldFileName);
|
||||||
|
}
|
||||||
|
|
||||||
|
si.advanceNormGen(this.number);
|
||||||
|
IndexOutput out = directory().createOutput(si.getNormFileName(this.number));
|
||||||
try {
|
try {
|
||||||
out.writeBytes(bytes, maxDoc());
|
out.writeBytes(bytes, maxDoc());
|
||||||
} finally {
|
} finally {
|
||||||
out.close();
|
out.close();
|
||||||
}
|
}
|
||||||
String fileName;
|
|
||||||
if(cfsReader == null)
|
|
||||||
fileName = segment + ".f" + number;
|
|
||||||
else{
|
|
||||||
// use a different file name if we have compound format
|
|
||||||
fileName = segment + ".s" + number;
|
|
||||||
}
|
|
||||||
directory().renameFile(segment + ".tmp", fileName);
|
|
||||||
this.dirty = false;
|
this.dirty = false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -133,10 +136,14 @@ class SegmentReader extends IndexReader {
|
||||||
|
|
||||||
private void initialize(SegmentInfo si) throws IOException {
|
private void initialize(SegmentInfo si) throws IOException {
|
||||||
segment = si.name;
|
segment = si.name;
|
||||||
|
this.si = si;
|
||||||
|
|
||||||
|
boolean success = false;
|
||||||
|
|
||||||
|
try {
|
||||||
// Use compound file directory for some files, if it exists
|
// Use compound file directory for some files, if it exists
|
||||||
Directory cfsDir = directory();
|
Directory cfsDir = directory();
|
||||||
if (directory().fileExists(segment + ".cfs")) {
|
if (si.getUseCompoundFile()) {
|
||||||
cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
|
cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
|
||||||
cfsDir = cfsReader;
|
cfsDir = cfsReader;
|
||||||
}
|
}
|
||||||
|
@ -148,8 +155,9 @@ class SegmentReader extends IndexReader {
|
||||||
tis = new TermInfosReader(cfsDir, segment, fieldInfos);
|
tis = new TermInfosReader(cfsDir, segment, fieldInfos);
|
||||||
|
|
||||||
// NOTE: the bitvector is stored using the regular directory, not cfs
|
// NOTE: the bitvector is stored using the regular directory, not cfs
|
||||||
if (hasDeletions(si))
|
if (hasDeletions(si)) {
|
||||||
deletedDocs = new BitVector(directory(), segment + ".del");
|
deletedDocs = new BitVector(directory(), si.getDelFileName());
|
||||||
|
}
|
||||||
|
|
||||||
// make sure that all index files have been read or are kept open
|
// make sure that all index files have been read or are kept open
|
||||||
// so that if an index update removes them we'll still have them
|
// so that if an index update removes them we'll still have them
|
||||||
|
@ -160,6 +168,18 @@ class SegmentReader extends IndexReader {
|
||||||
if (fieldInfos.hasVectors()) { // open term vector files only as needed
|
if (fieldInfos.hasVectors()) { // open term vector files only as needed
|
||||||
termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment, fieldInfos);
|
termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment, fieldInfos);
|
||||||
}
|
}
|
||||||
|
success = true;
|
||||||
|
} finally {
|
||||||
|
|
||||||
|
// With lock-less commits, it's entirely possible (and
|
||||||
|
// fine) to hit a FileNotFound exception above. In
|
||||||
|
// this case, we want to explicitly close any subset
|
||||||
|
// of things that were opened so that we don't have to
|
||||||
|
// wait for a GC to do so.
|
||||||
|
if (!success) {
|
||||||
|
doClose();
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
protected void finalize() {
|
protected void finalize() {
|
||||||
|
@ -170,18 +190,38 @@ class SegmentReader extends IndexReader {
|
||||||
|
|
||||||
protected void doCommit() throws IOException {
|
protected void doCommit() throws IOException {
|
||||||
if (deletedDocsDirty) { // re-write deleted
|
if (deletedDocsDirty) { // re-write deleted
|
||||||
deletedDocs.write(directory(), segment + ".tmp");
|
String oldDelFileName = si.getDelFileName();
|
||||||
directory().renameFile(segment + ".tmp", segment + ".del");
|
if (oldDelFileName != null) {
|
||||||
|
// Mark this file for deletion. Note that we don't
|
||||||
|
// actually try to delete it until the new segments files is
|
||||||
|
// successfully written:
|
||||||
|
deleter.addPendingFile(oldDelFileName);
|
||||||
}
|
}
|
||||||
if(undeleteAll && directory().fileExists(segment + ".del")){
|
|
||||||
directory().deleteFile(segment + ".del");
|
si.advanceDelGen();
|
||||||
|
|
||||||
|
// We can write directly to the actual name (vs to a
|
||||||
|
// .tmp & renaming it) because the file is not live
|
||||||
|
// until segments file is written:
|
||||||
|
deletedDocs.write(directory(), si.getDelFileName());
|
||||||
|
}
|
||||||
|
if (undeleteAll && si.hasDeletions()) {
|
||||||
|
String oldDelFileName = si.getDelFileName();
|
||||||
|
if (oldDelFileName != null) {
|
||||||
|
// Mark this file for deletion. Note that we don't
|
||||||
|
// actually try to delete it until the new segments files is
|
||||||
|
// successfully written:
|
||||||
|
deleter.addPendingFile(oldDelFileName);
|
||||||
|
}
|
||||||
|
si.clearDelGen();
|
||||||
}
|
}
|
||||||
if (normsDirty) { // re-write norms
|
if (normsDirty) { // re-write norms
|
||||||
|
si.setNumField(fieldInfos.size());
|
||||||
Enumeration values = norms.elements();
|
Enumeration values = norms.elements();
|
||||||
while (values.hasMoreElements()) {
|
while (values.hasMoreElements()) {
|
||||||
Norm norm = (Norm) values.nextElement();
|
Norm norm = (Norm) values.nextElement();
|
||||||
if (norm.dirty) {
|
if (norm.dirty) {
|
||||||
norm.reWrite();
|
norm.reWrite(si);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -191,8 +231,12 @@ class SegmentReader extends IndexReader {
|
||||||
}
|
}
|
||||||
|
|
||||||
protected void doClose() throws IOException {
|
protected void doClose() throws IOException {
|
||||||
|
if (fieldsReader != null) {
|
||||||
fieldsReader.close();
|
fieldsReader.close();
|
||||||
|
}
|
||||||
|
if (tis != null) {
|
||||||
tis.close();
|
tis.close();
|
||||||
|
}
|
||||||
|
|
||||||
if (freqStream != null)
|
if (freqStream != null)
|
||||||
freqStream.close();
|
freqStream.close();
|
||||||
|
@@ -209,27 +253,19 @@ class SegmentReader extends IndexReader {
   }

   static boolean hasDeletions(SegmentInfo si) throws IOException {
-    return si.dir.fileExists(si.name + ".del");
+    return si.hasDeletions();
   }

   public boolean hasDeletions() {
     return deletedDocs != null;
   }


   static boolean usesCompoundFile(SegmentInfo si) throws IOException {
-    return si.dir.fileExists(si.name + ".cfs");
+    return si.getUseCompoundFile();
   }

   static boolean hasSeparateNorms(SegmentInfo si) throws IOException {
-    String[] result = si.dir.list();
-    String pattern = si.name + ".s";
-    int patternLength = pattern.length();
-    for(int i = 0; i < result.length; i++){
-      if(result[i].startsWith(pattern) && Character.isDigit(result[i].charAt(patternLength)))
-        return true;
-    }
-    return false;
+    return si.hasSeparateNorms();
   }

   protected void doDelete(int docNum) {
@@ -249,24 +285,28 @@ class SegmentReader extends IndexReader {
   Vector files() throws IOException {
     Vector files = new Vector(16);

+    if (si.getUseCompoundFile()) {
+      String name = segment + ".cfs";
+      if (directory().fileExists(name)) {
+        files.addElement(name);
+      }
+    } else {
       for (int i = 0; i < IndexFileNames.INDEX_EXTENSIONS.length; i++) {
         String name = segment + "." + IndexFileNames.INDEX_EXTENSIONS[i];
         if (directory().fileExists(name))
           files.addElement(name);
       }
+    }
+
+    if (si.hasDeletions()) {
+      files.addElement(si.getDelFileName());
+    }

     for (int i = 0; i < fieldInfos.size(); i++) {
-      FieldInfo fi = fieldInfos.fieldInfo(i);
-      if (fi.isIndexed && !fi.omitNorms){
-        String name;
-        if(cfsReader == null)
-          name = segment + ".f" + i;
-        else
-          name = segment + ".s" + i;
-        if (directory().fileExists(name))
-          files.addElement(name);
-      }
+      String name = si.getNormFileName(i);
+      if (name != null && directory().fileExists(name))
+        files.addElement(name);
     }
     return files;
   }
@@ -380,7 +420,6 @@ class SegmentReader extends IndexReader {
   protected synchronized byte[] getNorms(String field) throws IOException {
     Norm norm = (Norm) norms.get(field);
     if (norm == null) return null;  // not indexed, or norms not stored

     if (norm.bytes == null) {                     // value not yet read
       byte[] bytes = new byte[maxDoc()];
       norms(field, bytes, 0);
@@ -436,11 +475,9 @@ class SegmentReader extends IndexReader {
     for (int i = 0; i < fieldInfos.size(); i++) {
       FieldInfo fi = fieldInfos.fieldInfo(i);
       if (fi.isIndexed && !fi.omitNorms) {
-        // look first if there are separate norms in compound format
-        String fileName = segment + ".s" + fi.number;
         Directory d = directory();
-        if(!d.fileExists(fileName)){
-          fileName = segment + ".f" + fi.number;
+        String fileName = si.getNormFileName(fi.number);
+        if (!si.hasSeparateNorms(fi.number)) {
           d = cfsDir;
         }
         norms.put(fi.name, new Norm(d.openInput(fileName), fi.number));
@@ -128,7 +128,7 @@ public class FSDirectory extends Directory {
   * @return the FSDirectory for the named file. */
  public static FSDirectory getDirectory(String path, boolean create)
      throws IOException {
-    return getDirectory(path, create, null);
+    return getDirectory(new File(path), create, null, true);
  }

  /** Returns the directory instance for the named location, using the
@@ -143,10 +143,16 @@ public class FSDirectory extends Directory {
   * @param lockFactory instance of {@link LockFactory} providing the
   *        locking implementation.
   * @return the FSDirectory for the named file. */
+  public static FSDirectory getDirectory(String path, boolean create,
+                                         LockFactory lockFactory, boolean doRemoveOldFiles)
+      throws IOException {
+    return getDirectory(new File(path), create, lockFactory, doRemoveOldFiles);
+  }
+
  public static FSDirectory getDirectory(String path, boolean create,
                                         LockFactory lockFactory)
      throws IOException {
-    return getDirectory(new File(path), create, lockFactory);
+    return getDirectory(new File(path), create, lockFactory, true);
  }

  /** Returns the directory instance for the named location.
@@ -158,9 +164,9 @@ public class FSDirectory extends Directory {
   * @param file the path to the directory.
   * @param create if true, create, or erase any existing contents.
   * @return the FSDirectory for the named file. */
-  public static FSDirectory getDirectory(File file, boolean create)
+  public static FSDirectory getDirectory(File file, boolean create, boolean doRemoveOldFiles)
      throws IOException {
-    return getDirectory(file, create, null);
+    return getDirectory(file, create, null, doRemoveOldFiles);
  }

  /** Returns the directory instance for the named location, using the
@@ -176,7 +182,7 @@ public class FSDirectory extends Directory {
   *        locking implementation.
   * @return the FSDirectory for the named file. */
  public static FSDirectory getDirectory(File file, boolean create,
-                                         LockFactory lockFactory)
+                                         LockFactory lockFactory, boolean doRemoveOldFiles)
      throws IOException {
    file = new File(file.getCanonicalPath());
    FSDirectory dir;
@@ -188,7 +194,7 @@ public class FSDirectory extends Directory {
      } catch (Exception e) {
        throw new RuntimeException("cannot load FSDirectory class: " + e.toString(), e);
      }
-      dir.init(file, create, lockFactory);
+      dir.init(file, create, lockFactory, doRemoveOldFiles);
      DIRECTORIES.put(file, dir);
    } else {
@@ -199,7 +205,7 @@ public class FSDirectory extends Directory {
      }

      if (create) {
-        dir.create();
+        dir.create(doRemoveOldFiles);
      }
    }
  }
@@ -209,23 +215,35 @@ public class FSDirectory extends Directory {
    return dir;
  }

+  public static FSDirectory getDirectory(File file, boolean create,
+                                         LockFactory lockFactory)
+      throws IOException
+  {
+    return getDirectory(file, create, lockFactory, true);
+  }
+
+  public static FSDirectory getDirectory(File file, boolean create)
+      throws IOException {
+    return getDirectory(file, create, true);
+  }
+
  private File directory = null;
  private int refCount;

  protected FSDirectory() {};                     // permit subclassing

-  private void init(File path, boolean create) throws IOException {
+  private void init(File path, boolean create, boolean doRemoveOldFiles) throws IOException {
    directory = path;

    if (create) {
-      create();
+      create(doRemoveOldFiles);
    }

    if (!directory.isDirectory())
      throw new IOException(path + " not a directory");
  }

-  private void init(File path, boolean create, LockFactory lockFactory) throws IOException {
+  private void init(File path, boolean create, LockFactory lockFactory, boolean doRemoveOldFiles) throws IOException {

    // Set up lockFactory with cascaded defaults: if an instance was passed in,
    // use that; else if locks are disabled, use NoLockFactory; else if the
@@ -280,10 +298,10 @@ public class FSDirectory extends Directory {

    setLockFactory(lockFactory);

-    init(path, create);
+    init(path, create, doRemoveOldFiles);
  }

-  private synchronized void create() throws IOException {
+  private synchronized void create(boolean doRemoveOldFiles) throws IOException {
    if (!directory.exists())
      if (!directory.mkdirs())
        throw new IOException("Cannot create directory: " + directory);
@@ -291,7 +309,8 @@ public class FSDirectory extends Directory {
    if (!directory.isDirectory())
      throw new IOException(directory + " not a directory");

-    String[] files = directory.list(new IndexFileNameFilter()); // clear old files
+    if (doRemoveOldFiles) {
+      String[] files = directory.list(IndexFileNameFilter.getFilter()); // clear old files
      if (files == null)
        throw new IOException("Cannot read directory " + directory.getAbsolutePath());
      for (int i = 0; i < files.length; i++) {
@@ -299,13 +318,14 @@ public class FSDirectory extends Directory {
        if (!file.delete())
          throw new IOException("Cannot delete " + file);
      }
+    }

    lockFactory.clearAllLocks();
  }

  /** Returns an array of strings, one for each Lucene index file in the directory. */
  public String[] list() {
-    return directory.list(new IndexFileNameFilter());
+    return directory.list(IndexFileNameFilter.getFilter());
  }

  /** Returns true iff a file with the given name exists. */
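The FSDirectory hunks above swap `new IndexFileNameFilter()` for a shared `IndexFileNameFilter.getFilter()` instance, so directory listings reuse one stateless filter instead of allocating a new one per call. A minimal self-contained sketch of that singleton-filter pattern (the class name and extension list here are illustrative, not Lucene's actual 2.1 code):

```java
import java.io.File;
import java.io.FilenameFilter;

// Hypothetical stand-in for IndexFileNameFilter: a stateless,
// shared FilenameFilter exposed through a static getFilter().
class SketchIndexFileNameFilter implements FilenameFilter {
  private static final SketchIndexFileNameFilter INSTANCE = new SketchIndexFileNameFilter();

  // Illustrative extension list, not the real Lucene index-extension set.
  private static final String[] EXTENSIONS = {"cfs", "fnm", "del"};

  private SketchIndexFileNameFilter() {}

  // All callers share the same immutable instance.
  public static SketchIndexFileNameFilter getFilter() {
    return INSTANCE;
  }

  public boolean accept(File dir, String name) {
    // Generation-stamped commit files like "segments_2" match by prefix.
    if (name.startsWith("segments")) {
      return true;
    }
    int dot = name.lastIndexOf('.');
    if (dot == -1) {
      return false;
    }
    String suffix = name.substring(dot + 1);
    for (int i = 0; i < EXTENSIONS.length; i++) {
      if (EXTENSIONS[i].equals(suffix)) {
        return true;
      }
    }
    return false;
  }
}
```

Because the filter holds no per-call state, sharing one instance is safe from any number of threads listing directories concurrently.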
@@ -18,6 +18,7 @@ package org.apache.lucene.store;
 */

import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;
import java.io.Serializable;
import java.util.Hashtable;
@@ -105,7 +106,7 @@ public final class RAMDirectory extends Directory implements Serializable {
  }

  /** Returns an array of strings, one for each file in the directory. */
-  public final String[] list() {
+  public synchronized final String[] list() {
    String[] result = new String[files.size()];
    int i = 0;
    Enumeration names = files.keys();
@@ -175,8 +176,11 @@ public final class RAMDirectory extends Directory implements Serializable {
  }

  /** Returns a stream reading an existing file. */
-  public final IndexInput openInput(String name) {
+  public final IndexInput openInput(String name) throws IOException {
    RAMFile file = (RAMFile)files.get(name);
+    if (file == null) {
+      throw new FileNotFoundException(name);
+    }
    return new RAMInputStream(file);
  }
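The RAMDirectory hunk above makes `openInput` fail fast with a `FileNotFoundException` when the named file is absent, rather than passing a null `RAMFile` along and failing later with a `NullPointerException`; lock-less commit code relies on that exception to detect a missing `segments_N` file and fall back gracefully. A self-contained sketch of the same null-check pattern (a hypothetical class, not the real RAMDirectory):

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Hashtable;

// Hypothetical in-memory directory illustrating the patched behavior:
// a lookup miss surfaces immediately as FileNotFoundException.
class SketchRAMDirectory {
  private final Hashtable files = new Hashtable();

  void createFile(String name, byte[] contents) {
    files.put(name, contents);
  }

  byte[] openInput(String name) throws IOException {
    byte[] file = (byte[]) files.get(name);
    if (file == null) {
      // Mirrors the null check added to RAMDirectory.openInput.
      throw new FileNotFoundException(name);
    }
    return file;
  }
}
```

Callers that probe for an optional file can now distinguish "missing" from "corrupt" by catching `FileNotFoundException` specifically, while other `IOException`s still propagate.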
@@ -32,6 +32,7 @@ import org.apache.lucene.document.Field;

import java.util.Collection;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;

public class TestIndexReader extends TestCase
@@ -222,6 +223,11 @@ public class TestIndexReader extends TestCase
        assertEquals("deleted count", 100, deleted);
        assertEquals("deleted docFreq", 100, reader.docFreq(searchTerm));
        assertTermDocsCount("deleted termDocs", reader, searchTerm, 0);

        // open a 2nd reader to make sure first reader can
        // commit its changes (.del) while second reader
        // is open:
        IndexReader reader2 = IndexReader.open(dir);
        reader.close();

        // CREATE A NEW READER and re-test
@@ -231,11 +237,74 @@ public class TestIndexReader extends TestCase
        reader.close();
    }

    // Make sure you can set norms & commit even if a reader
    // is open against the index:
    public void testWritingNorms() throws IOException
    {
        String tempDir = System.getProperty("tempDir");
        if (tempDir == null)
            throw new IOException("tempDir undefined, cannot run test");

        File indexDir = new File(tempDir, "lucenetestnormwriter");
        Directory dir = FSDirectory.getDirectory(indexDir, true);
        IndexWriter writer = null;
        IndexReader reader = null;
        Term searchTerm = new Term("content", "aaa");

        // add 1 documents with term : aaa
        writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
        addDoc(writer, searchTerm.text());
        writer.close();

        // now open reader & set norm for doc 0
        reader = IndexReader.open(dir);
        reader.setNorm(0, "content", (float) 2.0);

        // we should be holding the write lock now:
        assertTrue("locked", IndexReader.isLocked(dir));

        reader.commit();

        // we should not be holding the write lock now:
        assertTrue("not locked", !IndexReader.isLocked(dir));

        // open a 2nd reader:
        IndexReader reader2 = IndexReader.open(dir);

        // set norm again for doc 0
        reader.setNorm(0, "content", (float) 3.0);
        assertTrue("locked", IndexReader.isLocked(dir));

        reader.close();

        // we should not be holding the write lock now:
        assertTrue("not locked", !IndexReader.isLocked(dir));

        reader2.close();
        dir.close();

        rmDir(indexDir);
    }


    public void testDeleteReaderWriterConflictUnoptimized() throws IOException{
        deleteReaderWriterConflict(false);
    }

    public void testOpenEmptyDirectory() throws IOException{
      String dirName = "test.empty";
      File fileDirName = new File(dirName);
      if (!fileDirName.exists()) {
        fileDirName.mkdir();
      }
      try {
        IndexReader reader = IndexReader.open(fileDirName);
        fail("opening IndexReader on empty directory failed to produce FileNotFoundException");
      } catch (FileNotFoundException e) {
        // GOOD
      }
    }

    public void testDeleteReaderWriterConflictOptimized() throws IOException{
        deleteReaderWriterConflict(true);
    }
@@ -368,12 +437,36 @@ public class TestIndexReader extends TestCase
        assertFalse(IndexReader.isLocked(dir));      // reader only, no lock
        long version = IndexReader.lastModified(dir);
        reader.close();
-        // modify index and check version has been incremented:
+        // modify index and check version has been
+        // incremented:
        writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
        addDocumentWithFields(writer);
        writer.close();
        reader = IndexReader.open(dir);
-        assertTrue(version < IndexReader.getCurrentVersion(dir));
+        assertTrue("old lastModified is " + version + "; new lastModified is " + IndexReader.lastModified(dir), version <= IndexReader.lastModified(dir));
+        reader.close();
+    }
+
+    public void testVersion() throws IOException {
+        assertFalse(IndexReader.indexExists("there_is_no_such_index"));
+        Directory dir = new RAMDirectory();
+        assertFalse(IndexReader.indexExists(dir));
+        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
+        addDocumentWithFields(writer);
+        assertTrue(IndexReader.isLocked(dir));      // writer open, so dir is locked
+        writer.close();
+        assertTrue(IndexReader.indexExists(dir));
+        IndexReader reader = IndexReader.open(dir);
+        assertFalse(IndexReader.isLocked(dir));      // reader only, no lock
+        long version = IndexReader.getCurrentVersion(dir);
+        reader.close();
+        // modify index and check version has been
+        // incremented:
+        writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
+        addDocumentWithFields(writer);
+        writer.close();
+        reader = IndexReader.open(dir);
+        assertTrue("old version is " + version + "; new version is " + IndexReader.getCurrentVersion(dir), version < IndexReader.getCurrentVersion(dir));
        reader.close();
    }

@@ -412,6 +505,40 @@ public class TestIndexReader extends TestCase
        reader.close();
    }

    public void testUndeleteAllAfterClose() throws IOException {
      Directory dir = new RAMDirectory();
      IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      addDocumentWithFields(writer);
      addDocumentWithFields(writer);
      writer.close();
      IndexReader reader = IndexReader.open(dir);
      reader.deleteDocument(0);
      reader.deleteDocument(1);
      reader.close();
      reader = IndexReader.open(dir);
      reader.undeleteAll();
      assertEquals(2, reader.numDocs());  // nothing has really been deleted thanks to undeleteAll()
      reader.close();
    }

    public void testUndeleteAllAfterCloseThenReopen() throws IOException {
      Directory dir = new RAMDirectory();
      IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      addDocumentWithFields(writer);
      addDocumentWithFields(writer);
      writer.close();
      IndexReader reader = IndexReader.open(dir);
      reader.deleteDocument(0);
      reader.deleteDocument(1);
      reader.close();
      reader = IndexReader.open(dir);
      reader.undeleteAll();
      reader.close();
      reader = IndexReader.open(dir);
      assertEquals(2, reader.numDocs());  // nothing has really been deleted thanks to undeleteAll()
      reader.close();
    }

    public void testDeleteReaderReaderConflictUnoptimized() throws IOException{
        deleteReaderReaderConflict(false);
    }
@@ -562,4 +689,11 @@ public class TestIndexReader extends TestCase
        doc.add(new Field("content", value, Field.Store.NO, Field.Index.TOKENIZED));
        writer.addDocument(doc);
    }
    private void rmDir(File dir) {
      File[] files = dir.listFiles();
      for (int i = 0; i < files.length; i++) {
        files[i].delete();
      }
      dir.delete();
    }
}
@@ -1,6 +1,7 @@
package org.apache.lucene.index;

import java.io.IOException;
import java.io.File;

import junit.framework.TestCase;
@@ -10,7 +11,10 @@ import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;

/**
@@ -28,14 +32,11 @@ public class TestIndexWriter extends TestCase
      int i;

      IndexWriter.setDefaultWriteLockTimeout(2000);
-      IndexWriter.setDefaultCommitLockTimeout(2000);
      assertEquals(2000, IndexWriter.getDefaultWriteLockTimeout());
-      assertEquals(2000, IndexWriter.getDefaultCommitLockTimeout());

      writer  = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

      IndexWriter.setDefaultWriteLockTimeout(1000);
-      IndexWriter.setDefaultCommitLockTimeout(1000);

      // add 100 documents
      for (i = 0; i < 100; i++) {
@@ -72,6 +73,12 @@ public class TestIndexWriter extends TestCase
      assertEquals(60, reader.maxDoc());
      assertEquals(60, reader.numDocs());
      reader.close();

      // make sure opening a new index for create over
      // this existing one works correctly:
      writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      assertEquals(0, writer.docCount());
      writer.close();
    }

    private void addDoc(IndexWriter writer) throws IOException
@@ -80,4 +87,192 @@ public class TestIndexWriter extends TestCase
      doc.add(new Field("content", "aaa", Field.Store.NO, Field.Index.TOKENIZED));
      writer.addDocument(doc);
    }

    // Make sure we can open an index for create even when a
    // reader holds it open (this fails pre lock-less
    // commits on windows):
    public void testCreateWithReader() throws IOException {
      String tempDir = System.getProperty("java.io.tmpdir");
      if (tempDir == null)
        throw new IOException("java.io.tmpdir undefined, cannot run test");
      File indexDir = new File(tempDir, "lucenetestindexwriter");
      Directory dir = FSDirectory.getDirectory(indexDir, true);

      // add one document & close writer
      IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      addDoc(writer);
      writer.close();

      // now open reader:
      IndexReader reader = IndexReader.open(dir);
      assertEquals("should be one document", reader.numDocs(), 1);

      // now open index for create:
      writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      assertEquals("should be zero documents", writer.docCount(), 0);
      addDoc(writer);
      writer.close();

      assertEquals("should be one document", reader.numDocs(), 1);
      IndexReader reader2 = IndexReader.open(dir);
      assertEquals("should be one document", reader2.numDocs(), 1);
      reader.close();
      reader2.close();
      rmDir(indexDir);
    }

    // Simulate a writer that crashed while writing segments
    // file: make sure we can still open the index (ie,
    // gracefully fallback to the previous segments file),
    // and that we can add to the index:
    public void testSimulatedCrashedWriter() throws IOException {
      Directory dir = new RAMDirectory();

      IndexWriter writer = null;

      writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

      // add 100 documents
      for (int i = 0; i < 100; i++) {
        addDoc(writer);
      }

      // close
      writer.close();

      long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
      assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

      // Make the next segments file, with last byte
      // missing, to simulate a writer that crashed while
      // writing segments file:
      String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
      String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                                 "",
                                                                 1+gen);
      IndexInput in = dir.openInput(fileNameIn);
      IndexOutput out = dir.createOutput(fileNameOut);
      long length = in.length();
      for(int i=0;i<length-1;i++) {
        out.writeByte(in.readByte());
      }
      in.close();
      out.close();

      IndexReader reader = null;
      try {
        reader = IndexReader.open(dir);
      } catch (Exception e) {
        fail("reader failed to open on a crashed index");
      }
      reader.close();

      try {
        writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
      } catch (Exception e) {
        fail("writer failed to open on a crashed index");
      }

      // add 100 documents
      for (int i = 0; i < 100; i++) {
        addDoc(writer);
      }

      // close
      writer.close();
    }

    // Simulate a corrupt index by removing last byte of
    // latest segments file and make sure we get an
    // IOException trying to open the index:
    public void testSimulatedCorruptIndex1() throws IOException {
      Directory dir = new RAMDirectory();

      IndexWriter writer = null;

      writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

      // add 100 documents
      for (int i = 0; i < 100; i++) {
        addDoc(writer);
      }

      // close
      writer.close();

      long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
      assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

      String fileNameIn = SegmentInfos.getCurrentSegmentFileName(dir);
      String fileNameOut = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS,
                                                                 "",
                                                                 1+gen);
      IndexInput in = dir.openInput(fileNameIn);
      IndexOutput out = dir.createOutput(fileNameOut);
      long length = in.length();
      for(int i=0;i<length-1;i++) {
        out.writeByte(in.readByte());
      }
      in.close();
      out.close();
      dir.deleteFile(fileNameIn);

      IndexReader reader = null;
      try {
        reader = IndexReader.open(dir);
        fail("reader did not hit IOException on opening a corrupt index");
      } catch (Exception e) {
      }
      if (reader != null) {
        reader.close();
      }
    }

    // Simulate a corrupt index by removing one of the cfs
    // files and make sure we get an IOException trying to
    // open the index:
    public void testSimulatedCorruptIndex2() throws IOException {
      Directory dir = new RAMDirectory();

      IndexWriter writer = null;

      writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);

      // add 100 documents
      for (int i = 0; i < 100; i++) {
        addDoc(writer);
      }

      // close
      writer.close();

      long gen = SegmentInfos.getCurrentSegmentGeneration(dir);
      assertTrue("segment generation should be > 1 but got " + gen, gen > 1);

      String[] files = dir.list();
      for(int i=0;i<files.length;i++) {
        if (files[i].endsWith(".cfs")) {
          dir.deleteFile(files[i]);
          break;
        }
      }

      IndexReader reader = null;
      try {
        reader = IndexReader.open(dir);
        fail("reader did not hit IOException on opening a corrupt index");
      } catch (Exception e) {
      }
      if (reader != null) {
        reader.close();
      }
    }

    private void rmDir(File dir) {
      File[] files = dir.listFiles();
      for (int i = 0; i < files.length; i++) {
        files[i].delete();
      }
      dir.delete();
    }
}
@@ -80,6 +80,21 @@ public class TestMultiReader extends TestCase {
     assertEquals( 1, reader.numDocs() );
     reader.undeleteAll();
     assertEquals( 2, reader.numDocs() );
+
+    // Ensure undeleteAll survives commit/close/reopen:
+    reader.commit();
+    reader.close();
+    sis.read(dir);
+    reader = new MultiReader(dir, sis, false, readers);
+    assertEquals( 2, reader.numDocs() );
+
+    reader.deleteDocument(0);
+    assertEquals( 1, reader.numDocs() );
+    reader.commit();
+    reader.close();
+    sis.read(dir);
+    reader = new MultiReader(dir, sis, false, readers);
+    assertEquals( 1, reader.numDocs() );
   }
Binary file not shown.
Binary file not shown.
|
@@ -58,9 +58,9 @@ public class TestLockFactory extends TestCase {
 
         // Both write lock and commit lock should have been created:
         assertEquals("# of unique locks created (after instantiating IndexWriter)",
-                     2, lf.locksCreated.size());
-        assertTrue("# calls to makeLock <= 2 (after instantiating IndexWriter)",
-                   lf.makeLockCount > 2);
+                     1, lf.locksCreated.size());
+        assertTrue("# calls to makeLock is 0 (after instantiating IndexWriter)",
+                   lf.makeLockCount >= 1);
 
         for(Enumeration e = lf.locksCreated.keys(); e.hasMoreElements();) {
             String lockName = (String) e.nextElement();
@@ -90,6 +90,7 @@ public class TestLockFactory extends TestCase {
         try {
             writer2 = new IndexWriter(dir, new WhitespaceAnalyzer(), false);
         } catch (Exception e) {
+            e.printStackTrace(System.out);
             fail("Should not have hit an IOException with no locking");
         }
@@ -234,6 +235,7 @@ public class TestLockFactory extends TestCase {
         try {
             writer2 = new IndexWriter(indexDirName, new WhitespaceAnalyzer(), false);
         } catch (IOException e) {
+            e.printStackTrace(System.out);
             fail("Should not have hit an IOException with locking disabled");
         }
@@ -266,6 +268,7 @@ public class TestLockFactory extends TestCase {
         try {
             fs2 = FSDirectory.getDirectory(indexDirName, true, lf);
         } catch (IOException e) {
+            e.printStackTrace(System.out);
             fail("Should not have hit an IOException because LockFactory instances are the same");
         }
@@ -294,7 +297,6 @@ public class TestLockFactory extends TestCase {
 
     public void _testStressLocks(LockFactory lockFactory, String indexDirName) throws IOException {
         FSDirectory fs1 = FSDirectory.getDirectory(indexDirName, true, lockFactory);
-        // fs1.setLockFactory(NoLockFactory.getNoLockFactory());
 
         // First create a 1 doc index:
         IndexWriter w = new IndexWriter(fs1, new WhitespaceAnalyzer(), true);
@@ -405,6 +407,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Writer: creation hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             if (writer != null) {
                 try {
@@ -413,6 +416,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Writer: addDoc hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             try {
                 writer.close();
@@ -420,6 +424,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Writer: close hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             writer = null;
         }
@@ -446,6 +451,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Searcher: create hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             if (searcher != null) {
                 Hits hits = null;
@@ -455,6 +461,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Searcher: search hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             // System.out.println(hits.length() + " total results");
             try {
@@ -463,6 +470,7 @@ public class TestLockFactory extends TestCase {
                 hitException = true;
                 System.out.println("Stress Test Index Searcher: close hit unexpected exception: " + e.toString());
                 e.printStackTrace(System.out);
+                break;
             }
             searcher = null;
         }
@@ -14,7 +14,7 @@
 
         <p>
             This document defines the index file formats used
-            in Lucene version 2.0.  If you are using a different
+            in Lucene version 2.1.  If you are using a different
             version of Lucene, please consult the copy of
             <code>docs/fileformats.html</code> that was distributed
             with the version you are using.
@@ -43,6 +43,18 @@
             describing how file formats have changed from prior versions.
         </p>
 
+        <p>
+            In version 2.1, the file format was changed to allow
+            lock-less commits (ie, no more commit lock).  The
+            change is fully backwards compatible: you can open a
+            pre-2.1 index for searching or adding/deleting of
+            docs.  When the new segments file is saved
+            (committed), it will be written in the new file format
+            (meaning no specific "upgrade" process is needed).
+            But note that once a commit has occurred, pre-2.1
+            Lucene will not be able to read the index.
+        </p>
+
      </section>
 
      <section name="Definitions">
@@ -260,6 +272,18 @@
             required.
         </p>
 
+        <p>
+            As of version 2.1 (lock-less commits), file names are
+            never re-used (there is one exception, "segments.gen",
+            see below).  That is, when any file is saved to the
+            Directory it is given a never before used filename.
+            This is achieved using a simple generations approach.
+            For example, the first segments file is segments_1,
+            then segments_2, etc.  The generation is a sequential
+            long integer represented in alpha-numeric (base 36)
+            form.
+        </p>
+
      </section>
 
      <section name="Primitive Types">
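The base-36 generation naming described in the paragraph above can be sketched as a small standalone helper. This mirrors the `IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS, "", 1+gen)` call seen in the test code earlier in this commit, but the class below is a hypothetical reconstruction, not the real Lucene implementation; the use of `Long.toString(gen, Character.MAX_RADIX)` (radix 36) is an assumption based on the base-36 description.

```java
// Hypothetical sketch of generation-based file naming, assuming the base-36
// encoding described in the fileformats documentation above.
public class SegmentsFileName {

    // Builds e.g. "segments_1" for generation 1; generation 0 is assumed to
    // fall back to the legacy un-suffixed name.
    public static String fileNameFromGeneration(String base, String extension, long gen) {
        if (gen == 0) {
            return base + extension;   // legacy pre-2.1 style name
        }
        // "_" + generation rendered in base 36 (Character.MAX_RADIX == 36)
        return base + "_" + Long.toString(gen, Character.MAX_RADIX) + extension;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFromGeneration("segments", "", 1));   // segments_1
        System.out.println(fileNameFromGeneration("segments", "", 46));  // segments_1a
    }
}
```

Because the generation only ever increases, each commit produces a file name that has never been used before, which is what makes the lock-free read path possible.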
@@ -696,22 +720,48 @@
 
         <p>
             The active segments in the index are stored in the
-            segment info file.  An index only has
-            a single file in this format, and it is named "segments".
-            This lists each segment by name, and also contains the size of each
-            segment.
+            segment info file, <tt>segments_N</tt>.  There may
+            be one or more <tt>segments_N</tt> files in the
+            index; however, the one with the largest
+            generation is the active one (when older
+            segments_N files are present it's because they
+            temporarily cannot be deleted, or, a writer is in
+            the process of committing).  This file lists each
+            segment by name, has details about the separate
+            norms and deletion files, and also contains the
+            size of each segment.
         </p>
 
         <p>
+            As of 2.1, there is also a file
+            <tt>segments.gen</tt>.  This file contains the
+            current generation (the <tt>_N</tt> in
+            <tt>segments_N</tt>) of the index.  This is
+            used only as a fallback in case the current
+            generation cannot be accurately determined by
+            directory listing alone (as is the case for some
+            NFS clients with time-based directory cache
+            expiration).  This file simply contains an Int32
+            version header (SegmentInfos.FORMAT_LOCKLESS =
+            -2), followed by the generation recorded as Int64,
+            written twice.
+        </p>
+
+        <p>
+            <b>Pre-2.1:</b>
             Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize><sup>SegCount</sup>
         </p>
 
         <p>
-            Format, NameCounter, SegCount, SegSize --> UInt32
+            <b>2.1 and above:</b>
+            Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, NumField, NormGen<sup>NumField</sup> ><sup>SegCount</sup>, IsCompoundFile
+        </p>
+
+        <p>
+            Format, NameCounter, SegCount, SegSize, NumField --> Int32
         </p>
 
         <p>
-            Version --> UInt64
+            Version, DelGen, NormGen --> Int64
         </p>
 
         <p>
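The segments.gen layout just described (an Int32 FORMAT_LOCKLESS header of -2, then the generation written twice as Int64) can be sketched as a small reader. This is a hedged reconstruction, not Lucene's actual code: `SegmentsGen` is a hypothetical class name, and it assumes Lucene's on-disk Int32/Int64 encoding is big-endian, matching `DataInputStream`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of parsing "segments.gen" per the description above.
public class SegmentsGen {
    static final int FORMAT_LOCKLESS = -2;

    // Returns the recorded generation, or -1 if the header is wrong or the
    // two redundant copies of the generation disagree (a partial write).
    public static long readGeneration(InputStream stream) {
        try (DataInputStream in = new DataInputStream(stream)) {
            int format = in.readInt();
            if (format != FORMAT_LOCKLESS) {
                return -1;                       // not a lock-less header
            }
            long gen0 = in.readLong();
            long gen1 = in.readLong();
            return (gen0 == gen1) ? gen0 : -1;   // both copies must agree
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(FORMAT_LOCKLESS);
        out.writeLong(5L);   // the generation, written twice
        out.writeLong(5L);
        System.out.println(readGeneration(new ByteArrayInputStream(buf.toByteArray())));  // 5
    }
}
```

Writing the generation twice lets a reader detect a torn write: if the copies differ, the file is ignored and the directory listing is trusted instead.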
@@ -719,7 +769,11 @@
         </p>
 
         <p>
-            Format is -1 in Lucene 1.4.
+            IsCompoundFile --> Int8
+        </p>
+
+        <p>
+            Format is -1 as of Lucene 1.4 and -2 as of Lucene 2.1.
         </p>
 
         <p>
@@ -740,65 +794,79 @@
             SegSize is the number of documents contained in the segment index.
         </p>
 
+        <p>
+            DelGen is the generation count of the separate
+            deletes file.  If this is -1, there are no
+            separate deletes.  If it is 0, this is a pre-2.1
+            segment and you must check the filesystem for the
+            existence of _X.del.  Anything above zero means
+            there are separate deletes (_X_N.del).
+        </p>
+
+        <p>
+            NumField is the size of the array for NormGen, or
+            -1 if there are no NormGens stored.
+        </p>
+
+        <p>
+            NormGen records the generation of the separate
+            norms files.  If NumField is -1, there are no
+            NormGens stored: they are all assumed to be 0
+            when the segment file was written pre-2.1, and all
+            assumed to be -1 when the segments file is 2.1 or
+            above.  The generation then has the same meaning
+            as DelGen (above).
+        </p>
+
+        <p>
+            IsCompoundFile records whether the segment is
+            written as a compound file or not.  If this is -1,
+            the segment is not a compound file.  If it is 1,
+            the segment is a compound file.  Else it is 0, in
+            which case we check the filesystem to see if
+            _X.cfs exists.
+        </p>
+
     </subsection>
 
-    <subsection name="Lock Files">
+    <subsection name="Lock File">
 
         <p>
-            Several files are used to indicate that another
-            process is using an index.  Note that these files are not
+            A write lock is used to indicate that another
+            process is writing to the index.  Note that this file is not
             stored in the index directory itself, but rather in the
             system's temporary directory, as indicated in the Java
             system property "java.io.tmpdir".
         </p>
 
-        <ul>
-            <li>
-                <p>
-                    When a file named "commit.lock"
-                    is present, a process is currently re-writing the "segments"
-                    file and deleting outdated segment index files, or a process is
-                    reading the "segments"
-                    file and opening the files of the segments it names.  This lock file
-                    prevents files from being deleted by another process after a process
-                    has read the "segments"
-                    file but before it has managed to open all of the files of the
-                    segments named therein.
-                </p>
-            </li>
-            <li>
-                <p>
-                    When a file named "write.lock"
-                    is present, a process is currently adding documents to an index, or
-                    removing files from that index.  This lock file prevents several
-                    processes from attempting to modify an index at the same time.
-                </p>
-            </li>
-        </ul>
+        <p>
+            The write lock is named "XXXX-write.lock" where
+            XXXX is typically a unique prefix computed by the
+            directory path to the index.  When this file is
+            present, a process is currently adding documents
+            to an index, or removing files from that index.
+            This lock file prevents several processes from
+            attempting to modify an index at the same time.
+        </p>
+
+        <p>
+            Note that prior to version 2.1, Lucene also used a
+            commit lock.  This was removed in 2.1.
+        </p>
 
     </subsection>
 
     <subsection name="Deletable File">
 
         <p>
-            A file named "deletable"
-            contains the names of files that are no longer used by the index, but
-            which could not be deleted.  This is only used on Win32, where a
-            file may not be deleted while it is still open.  On other platforms
-            the file contains only null bytes.
+            Prior to Lucene 2.1 there was a file "deletable"
+            that contained details about files that need to be
+            deleted.  As of 2.1, a writer dynamically computes
+            the files that are deletable, instead, so no file
+            is written.
         </p>
-
-        <p>
-            Deletable --> DeletableCount,
-            <DelableName><sup>DeletableCount</sup>
-        </p>
-
-        <p>DeletableCount --> UInt32
-        </p>
-        <p>DeletableName -->
-            String
-        </p>
 
     </subsection>
 
     <subsection name="Compound Files">
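The DelGen rules in the hunk above (-1 means no separate deletes, 0 means a pre-2.1 segment whose _X.del must be probed on the filesystem, and anything above zero names _X_N.del) can be sketched as a tiny helper. This is an illustrative reconstruction, not Lucene's code: the class and method names are hypothetical, and rendering N in base 36 is an assumption made by analogy with the generation encoding described earlier in this document.

```java
// Hypothetical sketch of resolving a segment's deletes file name from its
// DelGen field, following the tri-state rules described above.
public class DeletesFileName {

    // segName is the segment name, e.g. "_7"; returns null if there are no
    // separate deletes for this segment.
    public static String deletesFileName(String segName, long delGen) {
        if (delGen == -1) {
            return null;                 // no separate deletes at all
        }
        if (delGen == 0) {
            // pre-2.1 segment: caller must check the directory for _X.del
            return segName + ".del";
        }
        // 2.1+: generation-suffixed name, assumed base-36 like segments_N
        return segName + "_" + Long.toString(delGen, Character.MAX_RADIX) + ".del";
    }

    public static void main(String[] args) {
        System.out.println(deletesFileName("_7", 3));   // _7_3.del
        System.out.println(deletesFileName("_7", 0));   // _7.del
        System.out.println(deletesFileName("_7", -1));  // null
    }
}
```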