Documentation updates following LUCENE-837.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@521830 13f79535-47bb-0310-9956-ffa450edef68
Doron Cohen 2007-03-23 17:56:09 +00:00
parent 251c2b328a
commit 346b484473
8 changed files with 273 additions and 227 deletions


@ -65,7 +65,7 @@ public class Benchmark {
public synchronized void execute() throws Exception { public synchronized void execute() throws Exception {
if (executed) { if (executed) {
throw new Exception("Benchmark was already executed"); throw new IllegalStateException("Benchmark was already executed");
} }
executed = true; executed = true;
algorithm.execute(); algorithm.execute();
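The hunk above changes the exception thrown on a second call to `execute()` from a generic `Exception` to `IllegalStateException`. A minimal sketch of that run-once guard follows. `OneShotRunner` and its `runs` counter are hypothetical stand-ins for illustration, not part of the Lucene benchmark code, and the real method body (`algorithm.execute()`) is replaced by a counter increment.

```java
// Hypothetical stand-in for the Benchmark class; only the run-once guard is modeled.
class OneShotRunner {
    private boolean executed = false;
    private int runs = 0; // assumption: replaces the real work done by algorithm.execute()

    public synchronized void execute() {
        if (executed) {
            // Unchecked exception signalling misuse of the object's state.
            throw new IllegalStateException("Benchmark was already executed");
        }
        executed = true;
        runs++;
    }

    public synchronized int runs() {
        return runs;
    }

    public static void main(String[] args) {
        OneShotRunner runner = new OneShotRunner();
        runner.execute();
        try {
            runner.execute();
        } catch (IllegalStateException e) {
            System.out.println("second call rejected: " + e.getMessage());
        }
    }
}
```

Because `IllegalStateException` is unchecked, callers of a guard like this one are not forced to catch it, which suits a condition that indicates a programming error rather than a recoverable failure.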


@ -41,7 +41,7 @@ import java.util.Properties;
* doc.stored=true|FALSE<br/> * doc.stored=true|FALSE<br/>
* doc.tokenized=TRUE|false<br/> * doc.tokenized=TRUE|false<br/>
* doc.term.vector=true|FALSE<br/> * doc.term.vector=true|FALSE<br/>
* doc.store.bytes=true|FALSE //Store the body contents raw UTF-8 bytes as a field<br/> * doc.store.body.bytes=true|FALSE //Store the body contents raw UTF-8 bytes as a field<br/>
*/ */
public abstract class BasicDocMaker implements DocMaker { public abstract class BasicDocMaker implements DocMaker {


@ -66,6 +66,7 @@ Contained packages:
<li><a href="#properties">Benchmark properties</a></li> <li><a href="#properties">Benchmark properties</a></li>
<li><a href="#example">Example input algorithm and the result benchmark <li><a href="#example">Example input algorithm and the result benchmark
report.</a></li> report.</a></li>
<li><a href="#recsCounting">Results record counting clarified</a></li>
</ol> </ol>
</p> </p>
<a name="concept"></a> <a name="concept"></a>
@ -199,11 +200,11 @@ The following is an informal description of the supported syntax.
addDoc 60 times in a row. addDoc 60 times in a row.
</li> </li>
<li> <li>
<b>Command parameter</b>: a command can take a single parameter. <b>Command parameter</b>: a command can optionally take a single parameter.
If a certain command does not support a parameter, or if the parameter is of If a certain command does not support a parameter, or if the parameter is of
the wrong type, the wrong type,
reading the algorithm will fail with an exception and the test will not start. reading the algorithm will fail with an exception and the test will not start.
Currently the following tasks take parameters: Currently the following tasks take optional parameters:
<ul> <ul>
<li><b>AddDoc</b> takes a numeric parameter, indicating the required size of <li><b>AddDoc</b> takes a numeric parameter, indicating the required size of
added document. Note: if the DocMaker implementation used in the test added document. Note: if the DocMaker implementation used in the test
@ -215,11 +216,15 @@ The following is an informal description of the supported syntax.
fixed, so for deletion in loops it is better to use the fixed, so for deletion in loops it is better to use the
<code>doc.delete.step</code> property. <code>doc.delete.step</code> property.
</li> </li>
<li><b>SetProp</b> takes a "name,value" param, ',' used as a separator. <li><b>SetProp</b> takes a <code>name,value</code> mandatory param,
',' used as a separator.
</li> </li>
<li><b>SearchTravRetTask</b> and <b>SearchTravTask</b> take a numeric <li><b>SearchTravRetTask</b> and <b>SearchTravTask</b> take a numeric
parameter, indicating the required traversal size. parameter, indicating the required traversal size.
</li> </li>
<li><b>SearchTravRetLoadFieldSelectorTask</b> takes a string
parameter: a comma separated list of Fields to load.
</li>
</ul> </ul>
<br>Example - <font color="#FF0066">AddDoc(2000)</font> - would add a document <br>Example - <font color="#FF0066">AddDoc(2000)</font> - would add a document
of size 2000 (~bytes). of size 2000 (~bytes).
@ -485,6 +490,7 @@ Here is a list of currently defined properties:
</li><li>doc.stored </li><li>doc.stored
</li><li>doc.tokenized </li><li>doc.tokenized
</li><li>doc.term.vector </li><li>doc.term.vector
</li><li>doc.store.body.bytes
</li><li>docs.dir </li><li>docs.dir
</li><li>query.maker </li><li>query.maker
</li><li>file.query.maker.file </li><li>file.query.maker.file
@ -535,7 +541,7 @@ The following example is in conf/sample.alg:
# The comparison is done twice. # The comparison is done twice.
# #
# -------------------------------------------------------- # --------------------------------------------------------
</font>
<font color="#990066"># ------------------------------------------------------------------------------------- <font color="#990066"># -------------------------------------------------------------------------------------
# multi val params are iterated by NewRound's, added to reports, start with column name. # multi val params are iterated by NewRound's, added to reports, start with column name.
merge.factor=mrg:10:20 merge.factor=mrg:10:20
@ -606,6 +612,33 @@ PopulateShort 1 20 1000 1 20003 143.5 139.39 6
PopulateLong - - 1 20 1000 - - 1 - - 10003 - - - 77.0 - - 129.92 - 87,309,608 - 100,831,232 PopulateLong - - 1 20 1000 - - 1 - - 10003 - - - 77.0 - - 129.92 - 87,309,608 - 100,831,232
</pre> </pre>
</p> </p>
<a name="recsCounting"></a>
<h2>Results record counting clarified</h2>
<p>
Two columns in the results table indicate record counts: records-per-run and
records-per-second. What do they mean?
</p><p>
Almost every task gets 1 in this count just for being executed.
Task sequences aggregate the counts of their child tasks,
plus their own count of 1.
So, a task sequence containing 5 other task sequences, each running a single
other task 10 times, would have a count of 1 + 5 * (1 + 10) = 56.
</p><p>
The traverse and retrieve tasks "count" more: a traverse task
would add 1 for each traversed result (hit), and a retrieve task would
additionally add 1 for each retrieved doc. So, a regular Search would
count 1, a SearchTrav that traverses 10 hits would count 11, and a
SearchTravRet task that retrieves (and traverses) 10 hits would count 21.
</p><p>
Confusing? This might help: always examine the <code>elapsedSec</code> column,
and always compare "apples to apples", i.e. it is interesting to check how the
<code>rec/s</code> changed for the same task (or sequence) between two
different runs, but it is not very useful to know how the <code>rec/s</code>
differs between <code>Search</code> and <code>SearchTrav</code> tasks. For
the latter, <code>elapsedSec</code> would bring more insight.
</p>
</DIV> </DIV>
<DIV>&nbsp;</DIV> <DIV>&nbsp;</DIV>
</BODY> </BODY>
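The counting rules in the "Results record counting clarified" section added above can be checked with a little arithmetic. The sketch below is only a model of those rules as the text states them. `RecordCounts` and its method names are hypothetical and do not exist in the benchmark package, and the retrieve count assumes every retrieved document is non-null.

```java
// Hypothetical model of the record-counting rules described in the documentation.
class RecordCounts {
    /** A task sequence counts 1 for itself plus the counts of its children. */
    static int sequenceCount(int... childCounts) {
        int total = 1;
        for (int c : childCounts) {
            total += c;
        }
        return total;
    }

    /** A child that counts perRun records and is repeated the given number of times. */
    static int repeated(int perRun, int times) {
        return perRun * times;
    }

    /** Traverse task: 1 for running, plus 1 per traversed hit. */
    static int searchTravCount(int traversedHits) {
        return 1 + traversedHits;
    }

    /** Traverse-and-retrieve task: additionally 1 per retrieved (non-null) document. */
    static int searchTravRetCount(int traversedHits, int retrievedDocs) {
        return 1 + traversedHits + retrievedDocs;
    }

    public static void main(String[] args) {
        int inner = sequenceCount(repeated(1, 10));                    // 1 + 10 = 11
        int outer = sequenceCount(inner, inner, inner, inner, inner);  // 1 + 5 * 11 = 56
        System.out.println("nested sequences: " + outer);
        System.out.println("SearchTrav(10): " + searchTravCount(10));            // 11
        System.out.println("SearchTravRet(10): " + searchTravRetCount(10, 10));  // 21
    }
}
```

The three printed values match the figures quoted in the documentation: 56, 11 and 21.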


@ -33,7 +33,10 @@ import java.io.IOException;
* <p>Note: This task reuses the reader if it is already open. * <p>Note: This task reuses the reader if it is already open.
* Otherwise a reader is opened at start and closed at the end. * Otherwise a reader is opened at start and closed at the end.
* *
* Takes optional param: comma separated list of Fields to load. * <p>Takes optional param: comma separated list of Fields to load.</p>
*
* <p>Other side effects: counts 1 additional record for each traversed hit,
* and 1 more for each retrieved (non-null) document.</p>
*/ */
public class SearchTravRetLoadFieldSelectorTask extends SearchTravTask { public class SearchTravRetLoadFieldSelectorTask extends SearchTravTask {
@ -53,7 +56,7 @@ public class SearchTravRetLoadFieldSelectorTask extends SearchTravTask {
} }
public void setParams(String params) { public void setParams(String params) {
this.params = params; this.params = params; // cannot just call super.setParams(), because its params differ.
Set fieldsToLoad = new HashSet(); Set fieldsToLoad = new HashSet();
for (StringTokenizer tokenizer = new StringTokenizer(params, ","); tokenizer.hasMoreTokens();) { for (StringTokenizer tokenizer = new StringTokenizer(params, ","); tokenizer.hasMoreTokens();) {
String s = tokenizer.nextToken(); String s = tokenizer.nextToken();
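The hunk is cut off before the loop body, so what `setParams` does with each token is not shown here. As a rough illustration of the parsing step only, the sketch below splits a comma-separated field list into a set with `StringTokenizer`. `FieldListParser` and `parseFields` are hypothetical names, and adding each token to the set unchanged (no trimming) is an assumption.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical helper; it models only the tokenizing step of setParams.
class FieldListParser {
    /** Splits "a,b,c" into the set {a, b, c}. Tokens are not trimmed (assumption). */
    static Set<String> parseFields(String params) {
        Set<String> fieldsToLoad = new HashSet<String>();
        StringTokenizer tokenizer = new StringTokenizer(params, ",");
        while (tokenizer.hasMoreTokens()) {
            fieldsToLoad.add(tokenizer.nextToken());
        }
        return fieldsToLoad;
    }

    public static void main(String[] args) {
        System.out.println(parseFields("title,body,date"));
    }
}
```

Note that `StringTokenizer` treats consecutive delimiters as one, so an input such as `a,,b` yields two tokens rather than an empty field name.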


@ -24,8 +24,12 @@ import org.apache.lucene.benchmark.byTask.PerfRunData;
* *
* <p>Note: This task reuses the reader if it is already open. * <p>Note: This task reuses the reader if it is already open.
* Otherwise a reader is opened at start and closed at the end. * Otherwise a reader is opened at start and closed at the end.
* </p>
* *
* Takes optional param: traversal size (otherwise all results are traversed). * <p>Takes optional param: traversal size (otherwise all results are traversed).</p>
*
* <p>Other side effects: counts 1 additional record for each traversed hit,
* and 1 more for each retrieved (non-null) document.</p>
*/ */
public class SearchTravRetTask extends SearchTravTask { public class SearchTravRetTask extends SearchTravTask {


@ -27,7 +27,9 @@ import org.apache.lucene.benchmark.byTask.feeds.QueryMaker;
* Otherwise a reader is opened at start and closed at the end. * Otherwise a reader is opened at start and closed at the end.
* <p/> * <p/>
* *
* Takes optional param: traversal size (otherwise all results are traversed). * <p>Takes optional param: traversal size (otherwise all results are traversed).</p>
*
* <p>Other side effects: counts 1 additional record for each traversed hit.</p>
*/ */
public class SearchTravTask extends ReadTask { public class SearchTravTask extends ReadTask {
protected int traversalSize = Integer.MAX_VALUE; protected int traversalSize = Integer.MAX_VALUE;


@ -21,10 +21,14 @@ import org.apache.lucene.benchmark.byTask.PerfRunData;
import org.apache.lucene.benchmark.byTask.feeds.QueryMaker; import org.apache.lucene.benchmark.byTask.feeds.QueryMaker;
/** /**
* Warm reader task. * Warm reader task: retrieve all reader documents.
* *
* <p>Note: This task reuses the reader if it is already open. * <p>Note: This task reuses the reader if it is already open.
* Otherwise a reader is opened at start and closed at the end. * Otherwise a reader is opened at start and closed at the end.
* </p>
*
* <p>Other side effects: counts 1 additional record for each
* retrieved (non-null) document.</p>
*/ */
public class WarmTask extends ReadTask { public class WarmTask extends ReadTask {


@ -44,8 +44,8 @@ public class Config {
private String algorithmText; private String algorithmText;
/** /**
* Read config from file containing both algorithm and config properties. * Read both algorithm and config properties.
* @param algFile file containing both algorithm and config properties. * @param algReader from where to read algorithm and config properties.
* @throws IOException * @throws IOException
*/ */
public Config (Reader algReader) throws IOException { public Config (Reader algReader) throws IOException {