lucene/Attic/api/overview.html

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="Author" content="Doug Cutting">
   <meta name="Author" content="Ted Husted">
   <meta name="GENERATOR" content="Mozilla/4.72 [en] (Win98; U) [Netscape]">
   <title>Jakarta Lucene API Documentation</title>
</head>
<body>

<h1>Jakarta Lucene API Documentation</h1>
The <a href="http://jakarta.apache.org/lucene">Jakarta Lucene</a> API is divided into several
packages:
<ul>
<li>
<b><a href="org/apache/lucene/util/package-summary.html">com.lucene.util</a></b>
contains a few handy data structures, e.g., <a href="org/apache/lucene/util/BitVector.html">BitVector</a>
and <a href="org/apache/lucene/util/PriorityQueue.html">PriorityQueue</a>.</li>

<li>
<b><a href="org/apache/lucene/store/package-summary.html">com.lucene.store</a></b>
defines an abstract class for storing persistent data, the <a href="org/apache/lucene/store/Directory.html">Directory</a>,
a collection of named files written by an <a href="org/apache/lucene/store/OutputStream.html">OutputStream</a>
and read by an <a href="org/apache/lucene/store/InputStream.html">InputStream</a>.&nbsp;
Two implementations are provided, <a href="org/apache/lucene/store/FSDirectory.html">FSDirectory</a>,
which uses a file system directory to store files, and <a href="org/apache/lucene/store/RAMDirectory.html">RAMDirectory</a>
which implements files as memory-resident data structures.</li>

<li>
<b><a href="org/apache/lucene/document/package-summary.html">com.lucene.document</a></b>
provides a simple <a href="org/apache/lucene/document/Document.html">Document</a>
class.&nbsp; A document is simply a set of named <a href="org/apache/lucene/document/Field.html">Field</a>'s,
whose values may be strings or instances of <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>.</li>

<li>
<b><a href="org/apache/lucene/analysis/package-summary.html">com.lucene.analysis</a></b>
defines an abstract <a href="org/apache/lucene/analysis/Analyzer.html">Analyzer</a>
API for converting text from a <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>
into a <a href="org/apache/lucene/analysis/TokenStream.html">TokenStream</a>,
an enumeration of&nbsp; <a href="org/apache/lucene/analysis/Token.html">Token</a>'s.&nbsp;
A TokenStream is composed by applying <a href="org/apache/lucene/analysis/TokenFilter.html">TokenFilter</a>'s
to the output of a <a href="org/apache/lucene/analysis/Tokenizer.html">Tokenizer</a>.&nbsp;
A few simple implemenations are provided, including <a href="org/apache/lucene/analysis/StopAnalyzer.html">StopAnalyzer</a>
and the grammar-based <a href="org/apache/lucene/analysis/standard/StandardAnalyzer.html">StandardAnalyzer</a>.</li>

<li>
<b><a href="org/apache/lucene/index/package-summary.html">com.lucene.index</a></b>
provides two primary classes: <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>,
which creates and adds documents to indices; and <a href="org/apache/lucene/index/IndexReader.html">IndexReader</a>,
which accesses the data in the index.</li>

<li>
<b><a href="org/apache/lucene/search/package-summary.html">com.lucene.search</a></b>
provides data structures to represent queries (<a href="org/apache/lucene/search/TermQuery.html">TermQuery</a>
for individual words, <a href="org/apache/lucene/search/PhraseQuery.html">PhraseQuery</a>
for phrases, and <a href="org/apache/lucene/search/BooleanQuery.html">BooleanQuery</a>
for boolean combinations of queries) and the abstract <a href="org/apache/lucene/search/Searcher.html">Searcher</a>
which turns queries into <a href="org/apache/lucene/search/Hits.html">Hits</a>.
<a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
implements search over a single IndexReader.</li>

<li>
<b><a href="org/apache/lucene/queryParser/package-summary.html">com.lucene.queryParser</a></b>
uses <a href="http://www.suntest.com/JavaCC/">JavaCC</a> to implement a
<a href="org/apache/lucene/queryParser/QueryParser.html">QueryParser</a>.</li>
</ul>
To use Lucene, an application should:
<ol>
<li>
Create <a href="org/apache/lucene/document/Document.html">Document</a>'s by
adding
<a href="org/apache/lucene/document/Field.html">Field</a>'s.</li>

<li>
Create an <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>
and add documents to to it with <a href="org/apache/lucene/index/IndexWriter.html#addDocument(com.lucene.document.Document)">addDocument()</a>;</li>

<li>
Call <a href="org/apache/lucene/queryParser/QueryParser.html#parse(java.lang.String)">QueryParser.parse()</a>
to build a query from a string; and</li>

<li>
Create an <a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
and pass the query to it's <a href="org/apache/lucene/search/Searcher.html#search(com.lucene.search.Query)">search()</a>
method.</li>
</ol>
Some simple examples of code which does this are:
<ul>
<li>
&nbsp;<a href="../demo/FileDocument.java">FileDocument.java</a> contains
code to create a Document for a file.</li>

<li>
&nbsp;<a href="../demo/IndexFiles.java">IndexFiles.java</a> creates an
index for all the files contained in a directory.</li>

<li>
&nbsp;<a href="../demo/DeleteFiles.java">DeleteFiles.java</a> deletes some
of these files from the index.</li>

<li>
&nbsp;<a href="../demo/SearchFiles.java">SearchFiles.java</a> prompts for
queries and searches an index.</li>
</ul>
To demonstrate these, try:
<blockquote><tt>F:\> <b>java demo.IndexFiles rec.food.recipes\soups</b></tt>
<br><tt>adding rec.food.recipes\soups\abalone-chowder</tt>
<br><tt>&nbsp; </tt>[ ... ]
<p><tt>F:\> <b>java demo.SearchFiles</b></tt>
<br><tt>Query: <b>chowder</b></tt>
<br><tt>Searching for: chowder</tt>
<br><tt>34 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\spam-chowder</tt>
<br><tt>&nbsp; </tt>[ ... thirty-four documents contain the word "chowder",
"spam-chowder" with the greatest density.]
<p><tt>Query: <b>path:chowder</b></tt>
<br><tt>Searching for: path:chowder</tt>
<br><tt>31 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\abalone-chowder</tt>
<br><tt>&nbsp; </tt>[ ... only thrity-one have "chowder" in the "path"
field. ]
<p><tt>Query: <b>path:"clam chowder"</b></tt>
<br><tt>Searching for: path:"clam chowder"</tt>
<br><tt>10 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>
<br><tt>&nbsp; </tt>[ ... only ten have "clam chowder" in the "path" field.
]
<p><tt>Query: <b>path:"clam chowder" AND manhattan</b></tt>
<br><tt>Searching for: +path:"clam chowder" +manhattan</tt>
<br><tt>2 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>
<br><tt>&nbsp; </tt>[ ... only two also have "manhattan" in the contents.
]
<br>&nbsp;&nbsp;&nbsp; [ Note: "+" and "-" are canonical, but "AND", "OR"
and "NOT" may be used. ]</blockquote>
The <a href="../demo/IndexHTML.java">IndexHtml</a> demo is more sophisticated.&nbsp;
It incrementally maintains an index of HTML files, adding new files as
they appear, deleting old files as they disappear and re-indexing files
as they change.
<blockquote><tt>F:\><b>java demo.IndexHTML -create java\jdk1.1.6\docs\relnotes</b></tt>
<br><tt>adding java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt>
<br><tt>&nbsp; </tt>[ ... create an index containing all the relnotes ]
<p><tt>F:\><b>del java\jdk1.1.6\docs\relnotes\smicopyright.html</b></tt>
<p><tt>F:\><b>java demo.IndexHTML java\jdk1.1.6\docs\relnotes</b></tt>
<br><tt>deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt></blockquote>
HTML indexes are searched using SUN's <a href="http://jserv.javasoft.com/products/webserver/index.html">JavaWebServer</a>
(JWS) and <a href="../demo/Search.jhtml">Search.jhtml</a>.&nbsp; To use
this:
<ul>
<li>
copy <tt>Search.html</tt> and <tt>Search.jhtml</tt> to JWS's <tt>public_html</tt>
directory;</li>

<li>
copy lucene.jar to JWS's lib directory;</li>

<li>
create and maintain your indexes with demo.IndexHTML in JWS's top-level
directory;</li>

<li>
launch JWS, with the <tt>demo</tt> directory on CLASSPATH (only one class
is actually needed);</li>

<li>
visit <a href="../demo/Search.html">Search.html</a>.</li>
</ul>
Note that indexes can be updated while searches are going on.&nbsp; <tt>Search.jhtml</tt>
will re-open the index when it is updated so that the latest version is
immediately available.
<br>&nbsp;
</body>
</html>
Initial revision git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149565 13f79535-47bb-0310-9956-ffa450edef68 2001-09-11 17:44:36 -04:00			`<!doctype html public "-//w3c//dtd html 4.0 transitional//en">`
			`<html>`
			`<head>`
			`<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">`
			`<meta name="Author" content="Doug Cutting">`
			`<meta name="Author" content="Ted Husted">`
			`<meta name="GENERATOR" content="Mozilla/4.72 [en] (Win98; U) [Netscape]">`
			`<title>Jakarta Lucene API Documentation</title>`
			`</head>`
			`<body>`

			`<h1>Jakarta Lucene API Documentation</h1>`
			`The <a href="http://jakarta.apache.org/lucene">Jakarta Lucene</a> API is divided into several`
			`packages:`
			`<ul>`
			`<li>`
			`<b><a href="org/apache/lucene/util/package-summary.html">com.lucene.util</a></b>`
			`contains a few handy data structures, e.g., <a href="org/apache/lucene/util/BitVector.html">BitVector</a>`
			`and <a href="org/apache/lucene/util/PriorityQueue.html">PriorityQueue</a>.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/store/package-summary.html">com.lucene.store</a></b>`
			`defines an abstract class for storing persistent data, the <a href="org/apache/lucene/store/Directory.html">Directory</a>,`
			`a collection of named files written by an <a href="org/apache/lucene/store/OutputStream.html">OutputStream</a>`
			`and read by an <a href="org/apache/lucene/store/InputStream.html">InputStream</a>. `
			`Two implementations are provided, <a href="org/apache/lucene/store/FSDirectory.html">FSDirectory</a>,`
			`which uses a file system directory to store files, and <a href="org/apache/lucene/store/RAMDirectory.html">RAMDirectory</a>`
			`which implements files as memory-resident data structures.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/document/package-summary.html">com.lucene.document</a></b>`
			`provides a simple <a href="org/apache/lucene/document/Document.html">Document</a>`
			`class.  A document is simply a set of named <a href="org/apache/lucene/document/Field.html">Field</a>'s,`
			`whose values may be strings or instances of <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/analysis/package-summary.html">com.lucene.analysis</a></b>`
			`defines an abstract <a href="org/apache/lucene/analysis/Analyzer.html">Analyzer</a>`
			`API for converting text from a <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>`
			`into a <a href="org/apache/lucene/analysis/TokenStream.html">TokenStream</a>,`
			`an enumeration of  <a href="org/apache/lucene/analysis/Token.html">Token</a>'s. `
			`A TokenStream is composed by applying <a href="org/apache/lucene/analysis/TokenFilter.html">TokenFilter</a>'s`
			`to the output of a <a href="org/apache/lucene/analysis/Tokenizer.html">Tokenizer</a>. `
			`A few simple implemenations are provided, including <a href="org/apache/lucene/analysis/StopAnalyzer.html">StopAnalyzer</a>`
			`and the grammar-based <a href="org/apache/lucene/analysis/standard/StandardAnalyzer.html">StandardAnalyzer</a>.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/index/package-summary.html">com.lucene.index</a></b>`
			`provides two primary classes: <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>,`
			`which creates and adds documents to indices; and <a href="org/apache/lucene/index/IndexReader.html">IndexReader</a>,`
			`which accesses the data in the index.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/search/package-summary.html">com.lucene.search</a></b>`
			`provides data structures to represent queries (<a href="org/apache/lucene/search/TermQuery.html">TermQuery</a>`
			`for individual words, <a href="org/apache/lucene/search/PhraseQuery.html">PhraseQuery</a>`
			`for phrases, and <a href="org/apache/lucene/search/BooleanQuery.html">BooleanQuery</a>`
			`for boolean combinations of queries) and the abstract <a href="org/apache/lucene/search/Searcher.html">Searcher</a>`
			`which turns queries into <a href="org/apache/lucene/search/Hits.html">Hits</a>.`
			`<a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>`
			`implements search over a single IndexReader.</li>`

			`<li>`
			`<b><a href="org/apache/lucene/queryParser/package-summary.html">com.lucene.queryParser</a></b>`
			`uses <a href="http://www.suntest.com/JavaCC/">JavaCC</a> to implement a`
			`<a href="org/apache/lucene/queryParser/QueryParser.html">QueryParser</a>.</li>`
			`</ul>`
			`To use Lucene, an application should:`
			`<ol>`
			`<li>`
			`Create <a href="org/apache/lucene/document/Document.html">Document</a>'s by`
			`adding`
			`<a href="org/apache/lucene/document/Field.html">Field</a>'s.</li>`

			`<li>`
			`Create an <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>`
			`and add documents to to it with <a href="org/apache/lucene/index/IndexWriter.html#addDocument(com.lucene.document.Document)">addDocument()</a>;</li>`

			`<li>`
			`Call <a href="org/apache/lucene/queryParser/QueryParser.html#parse(java.lang.String)">QueryParser.parse()</a>`
			`to build a query from a string; and</li>`

			`<li>`
			`Create an <a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>`
			`and pass the query to it's <a href="org/apache/lucene/search/Searcher.html#search(com.lucene.search.Query)">search()</a>`
			`method.</li>`
			`</ol>`
			`Some simple examples of code which does this are:`
			`<ul>`
			`<li>`
			` <a href="../demo/FileDocument.java">FileDocument.java</a> contains`
			`code to create a Document for a file.</li>`

			`<li>`
			` <a href="../demo/IndexFiles.java">IndexFiles.java</a> creates an`
			`index for all the files contained in a directory.</li>`

			`<li>`
			` <a href="../demo/DeleteFiles.java">DeleteFiles.java</a> deletes some`
			`of these files from the index.</li>`

			`<li>`
			` <a href="../demo/SearchFiles.java">SearchFiles.java</a> prompts for`
			`queries and searches an index.</li>`
			`</ul>`
			`To demonstrate these, try:`
			`<blockquote><tt>F:\> <b>java demo.IndexFiles rec.food.recipes\soups</b></tt>`
			`<br><tt>adding rec.food.recipes\soups\abalone-chowder</tt>`
			`<br><tt>  </tt>[ ... ]`
			`<p><tt>F:\> <b>java demo.SearchFiles</b></tt>`
			`<br><tt>Query: <b>chowder</b></tt>`
			`<br><tt>Searching for: chowder</tt>`
			`<br><tt>34 total matching documents</tt>`
			`<br><tt>0. rec.food.recipes\soups\spam-chowder</tt>`
			`<br><tt>  </tt>[ ... thirty-four documents contain the word "chowder",`
			`"spam-chowder" with the greatest density.]`
			`<p><tt>Query: <b>path:chowder</b></tt>`
			`<br><tt>Searching for: path:chowder</tt>`
			`<br><tt>31 total matching documents</tt>`
			`<br><tt>0. rec.food.recipes\soups\abalone-chowder</tt>`
			`<br><tt>  </tt>[ ... only thrity-one have "chowder" in the "path"`
			`field. ]`
			`<p><tt>Query: <b>path:"clam chowder"</b></tt>`
			`<br><tt>Searching for: path:"clam chowder"</tt>`
			`<br><tt>10 total matching documents</tt>`
			`<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>`
			`<br><tt>  </tt>[ ... only ten have "clam chowder" in the "path" field.`
			`]`
			`<p><tt>Query: <b>path:"clam chowder" AND manhattan</b></tt>`
			`<br><tt>Searching for: +path:"clam chowder" +manhattan</tt>`
			`<br><tt>2 total matching documents</tt>`
			`<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>`
			`<br><tt>  </tt>[ ... only two also have "manhattan" in the contents.`
			`]`
			`<br>    [ Note: "+" and "-" are canonical, but "AND", "OR"`
			`and "NOT" may be used. ]</blockquote>`
			`The <a href="../demo/IndexHTML.java">IndexHtml</a> demo is more sophisticated. `
			`It incrementally maintains an index of HTML files, adding new files as`
			`they appear, deleting old files as they disappear and re-indexing files`
			`as they change.`
			`<blockquote><tt>F:\><b>java demo.IndexHTML -create java\jdk1.1.6\docs\relnotes</b></tt>`
			`<br><tt>adding java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt>`
			`<br><tt>  </tt>[ ... create an index containing all the relnotes ]`
			`<p><tt>F:\><b>del java\jdk1.1.6\docs\relnotes\smicopyright.html</b></tt>`
			`<p><tt>F:\><b>java demo.IndexHTML java\jdk1.1.6\docs\relnotes</b></tt>`
			`<br><tt>deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt></blockquote>`
			`HTML indexes are searched using SUN's <a href="http://jserv.javasoft.com/products/webserver/index.html">JavaWebServer</a>`
			`(JWS) and <a href="../demo/Search.jhtml">Search.jhtml</a>.  To use`
			`this:`
			`<ul>`
			`<li>`
			`copy <tt>Search.html</tt> and <tt>Search.jhtml</tt> to JWS's <tt>public_html</tt>`
			`directory;</li>`

			`<li>`
			`copy lucene.jar to JWS's lib directory;</li>`

			`<li>`
			`create and maintain your indexes with demo.IndexHTML in JWS's top-level`
			`directory;</li>`

			`<li>`
			`launch JWS, with the <tt>demo</tt> directory on CLASSPATH (only one class`
			`is actually needed);</li>`

			`<li>`
			`visit <a href="../demo/Search.html">Search.html</a>.</li>`
			`</ul>`
			`Note that indexes can be updated while searches are going on.  <tt>Search.jhtml</tt>`
			`will re-open the index when it is updated so that the latest version is`
			`immediately available.`
			`<br> `
			`</body>`
			`</html>`