mirror of https://github.com/apache/lucene.git
- a few stray pieces of documentation. ted you can put them where you like,
i just wanted to make sure they were visible so they are put in the appropriate place. PR: Obtained from: Submitted by: Reviewed by: git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149577 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
412a00a07d
commit
1a3dd53ada
@ -0,0 +1,174 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Doug Cutting">
<meta name="GENERATOR" content="Mozilla/4.72 [en] (Win98; U) [Netscape]">
<title>Lucene API Documentation</title>
</head>
<body>

<h1>
Lucene API Documentation</h1>
The <a href="http://www.lucene.com">Lucene</a> API is divided into several
packages:
<ul>
<li>
<b><a href="api/com/lucene/util/package-summary.html">com.lucene.util</a></b>
contains a few handy data structures, e.g., <a href="api/com/lucene/util/BitVector.html">BitVector</a>
and <a href="api/com/lucene/util/PriorityQueue.html">PriorityQueue</a>.</li>

<li>
<b><a href="api/com/lucene/store/package-summary.html">com.lucene.store</a></b>
defines an abstract class for storing persistent data, the <a href="api/com/lucene/store/Directory.html">Directory</a>,
a collection of named files written by an <a href="api/com/lucene/store/OutputStream.html">OutputStream</a>
and read by an <a href="api/com/lucene/store/InputStream.html">InputStream</a>.
Two implementations are provided, <a href="api/com/lucene/store/FSDirectory.html">FSDirectory</a>,
which uses a file system directory to store files, and <a href="api/com/lucene/store/RAMDirectory.html">RAMDirectory</a>
which implements files as memory-resident data structures.</li>

<li>
<b><a href="api/com/lucene/document/package-summary.html">com.lucene.document</a></b>
provides a simple <a href="api/com/lucene/document/Document.html">Document</a>
class. A document is simply a set of named <a href="api/com/lucene/document/Field.html">Field</a>'s,
whose values may be strings or instances of <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>.</li>

<li>
<b><a href="api/com/lucene/analysis/package-summary.html">com.lucene.analysis</a></b>
defines an abstract <a href="api/com/lucene/analysis/Analyzer.html">Analyzer</a>
API for converting text from a <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>
into a <a href="api/com/lucene/analysis/TokenStream.html">TokenStream</a>,
an enumeration of <a href="api/com/lucene/analysis/Token.html">Token</a>'s.
A TokenStream is composed by applying <a href="api/com/lucene/analysis/TokenFilter.html">TokenFilter</a>'s
to the output of a <a href="api/com/lucene/analysis/Tokenizer.html">Tokenizer</a>.
A few simple implementations are provided, including <a href="api/com/lucene/analysis/StopAnalyzer.html">StopAnalyzer</a>
and the grammar-based <a href="api/com/lucene/analysis/standard/StandardAnalyzer.html">StandardAnalyzer</a>.</li>

<li>
<b><a href="api/com/lucene/index/package-summary.html">com.lucene.index</a></b>
provides two primary classes: <a href="api/com/lucene/index/IndexWriter.html">IndexWriter</a>,
which creates and adds documents to indices; and <a href="api/com/lucene/index/IndexReader.html">IndexReader</a>,
which accesses the data in the index.</li>

<li>
<b><a href="api/com/lucene/search/package-summary.html">com.lucene.search</a></b>
provides data structures to represent queries (<a href="api/com/lucene/search/TermQuery.html">TermQuery</a>
for individual words, <a href="api/com/lucene/search/PhraseQuery.html">PhraseQuery</a>
for phrases, and <a href="api/com/lucene/search/BooleanQuery.html">BooleanQuery</a>
for boolean combinations of queries) and the abstract <a href="api/com/lucene/search/Searcher.html">Searcher</a>
which turns queries into <a href="api/com/lucene/search/Hits.html">Hits</a>.
<a href="api/com/lucene/search/IndexSearcher.html">IndexSearcher</a>
implements search over a single IndexReader.</li>

<li>
<b><a href="api/com/lucene/queryParser/package-summary.html">com.lucene.queryParser</a></b>
uses <a href="http://www.suntest.com/JavaCC/">JavaCC</a> to implement a
<a href="api/com/lucene/queryParser/QueryParser.html">QueryParser</a>.</li>
</ul>
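The way a TokenStream is built by layering TokenFilters over a Tokenizer can be pictured with a small self-contained sketch. The names used here (AnalysisSketch, TokenSource, whitespaceTokenizer, lowerCase, stopFilter) and the stop-word list are hypothetical stand-ins, not the com.lucene.analysis classes; the sketch only shows the wrapping pattern and assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-ins for illustration; not the com.lucene.analysis API.
class AnalysisSketch {
    /** A stream of tokens; next() returns null once the stream is exhausted. */
    interface TokenSource {
        String next();
    }

    /** Plays the role of a Tokenizer: splits raw text on whitespace. */
    static TokenSource whitespaceTokenizer(String text) {
        String trimmed = text.trim();
        final String[] parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        return new TokenSource() {
            private int pos = 0;
            public String next() {
                return pos < parts.length ? parts[pos++] : null;
            }
        };
    }

    /** Plays the role of a TokenFilter: lower-cases each token of the wrapped source. */
    static TokenSource lowerCase(final TokenSource in) {
        return new TokenSource() {
            public String next() {
                String token = in.next();
                return token == null ? null : token.toLowerCase(Locale.ROOT);
            }
        };
    }

    /** Plays the role of a TokenFilter: skips tokens found in the stop-word set. */
    static TokenSource stopFilter(final TokenSource in, final Set<String> stopWords) {
        return new TokenSource() {
            public String next() {
                String token;
                while ((token = in.next()) != null) {
                    if (!stopWords.contains(token)) {
                        return token;
                    }
                }
                return null;
            }
        };
    }

    /** Plays the role of an Analyzer: composes the chain and collects its output. */
    static List<String> analyze(String text) {
        Set<String> stopWords = new HashSet<String>(Arrays.asList("a", "the", "of"));
        TokenSource stream = stopFilter(lowerCase(whitespaceTokenizer(text)), stopWords);
        List<String> tokens = new ArrayList<String>();
        for (String token = stream.next(); token != null; token = stream.next()) {
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Chowder of Manhattan")); // prints [chowder, manhattan]
    }
}
```

Because each filter only sees the stream it wraps, the order of composition matters: here the stop filter runs after lower-casing, so "The" is removed as well as "the".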
To use Lucene, an application should:
<ol>
<li>
Create <a href="api/com/lucene/document/Document.html">Document</a>'s by
adding
<a href="api/com/lucene/document/Field.html">Field</a>'s.</li>

<li>
Create an <a href="api/com/lucene/index/IndexWriter.html">IndexWriter</a>
and add documents to it with <a href="api/com/lucene/index/IndexWriter.html#addDocument(com.lucene.document.Document)">addDocument()</a>;</li>

<li>
Call <a href="api/com/lucene/queryParser/QueryParser.html#parse(java.lang.String)">QueryParser.parse()</a>
to build a query from a string; and</li>

<li>
Create an <a href="api/com/lucene/search/IndexSearcher.html">IndexSearcher</a>
and pass the query to its <a href="api/com/lucene/search/Searcher.html#search(com.lucene.search.Query)">search()</a>
method.</li>
</ol>
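The flow of data through these four steps can be sketched with a toy in-memory inverted index. Everything in it (ToyIndex, document, addDocument, search, the default field name "contents") is a hypothetical stand-in chosen for illustration; it is not the com.lucene API, whose actual constructors and signatures should be taken from the linked javadoc. The sketch assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy stand-in for the index/search flow; none of these names are Lucene's.
class ToyIndex {
    // "field:term" -> ids of the documents containing that term in that field
    private final Map<String, SortedSet<Integer>> postings = new HashMap<String, SortedSet<Integer>>();
    // Stored documents, addressed by id (their position in this list)
    private final List<Map<String, String>> docs = new ArrayList<Map<String, String>>();

    /** Step 1: build a document as a set of named fields. */
    static Map<String, String> document(String path, String contents) {
        Map<String, String> fields = new HashMap<String, String>();
        fields.put("path", path);
        fields.put("contents", contents);
        return fields;
    }

    /** Step 2: add a document to the index and return its id. */
    int addDocument(Map<String, String> fields) {
        int id = docs.size();
        docs.add(fields);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String text = field.getValue().toLowerCase(Locale.ROOT);
            for (String term : text.split("\\W+")) {
                if (term.isEmpty()) {
                    continue; // split() yields an empty first piece after a leading separator
                }
                postings.computeIfAbsent(field.getKey() + ":" + term,
                        key -> new TreeSet<Integer>()).add(id);
            }
        }
        return id;
    }

    /** Steps 3 and 4: parse "field:term" (or a bare term) and look up matching ids. */
    SortedSet<Integer> search(String query) {
        int colon = query.indexOf(':');
        String field = colon < 0 ? "contents" : query.substring(0, colon).trim();
        String term = query.substring(colon + 1).trim().toLowerCase(Locale.ROOT);
        SortedSet<Integer> hits = postings.get(field + ":" + term);
        return hits == null ? new TreeSet<Integer>() : new TreeSet<Integer>(hits);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument(document("soups/clam-chowder", "Manhattan clam chowder with tomatoes"));
        index.addDocument(document("soups/abalone-chowder", "Abalone simmered in cream"));
        System.out.println(index.search("chowder"));      // ids whose "contents" field matches
        System.out.println(index.search("path:chowder")); // ids whose "path" field matches
    }
}
```

The toy version handles only single-term queries and returns unranked document ids, whereas the real Searcher returns ranked Hits.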
Some simple examples of code that follows these steps are:
<ul>
<li>
<a href="../demo/FileDocument.java">FileDocument.java</a> contains
code to create a Document for a file.</li>

<li>
<a href="../demo/IndexFiles.java">IndexFiles.java</a> creates an
index for all the files contained in a directory.</li>

<li>
<a href="../demo/DeleteFiles.java">DeleteFiles.java</a> deletes some
of these files from the index.</li>

<li>
<a href="../demo/SearchFiles.java">SearchFiles.java</a> prompts for
queries and searches an index.</li>
</ul>
To demonstrate these, try:
<blockquote><tt>F:\> <b>java demo.IndexFiles rec.food.recipes\soups</b></tt>
<br><tt>adding rec.food.recipes\soups\abalone-chowder</tt>
<br><tt> </tt>[ ... ]
<p><tt>F:\> <b>java demo.SearchFiles</b></tt>
<br><tt>Query: <b>chowder</b></tt>
<br><tt>Searching for: chowder</tt>
<br><tt>34 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\spam-chowder</tt>
<br><tt> </tt>[ ... thirty-four documents contain the word "chowder",
"spam-chowder" with the greatest density.]
<p><tt>Query: <b>path:chowder</b></tt>
<br><tt>Searching for: path:chowder</tt>
<br><tt>31 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\abalone-chowder</tt>
<br><tt> </tt>[ ... only thirty-one have "chowder" in the "path"
field. ]
<p><tt>Query: <b>path:"clam chowder"</b></tt>
<br><tt>Searching for: path:"clam chowder"</tt>
<br><tt>10 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>
<br><tt> </tt>[ ... only ten have "clam chowder" in the "path" field.
]
<p><tt>Query: <b>path:"clam chowder" AND manhattan</b></tt>
<br><tt>Searching for: +path:"clam chowder" +manhattan</tt>
<br><tt>2 total matching documents</tt>
<br><tt>0. rec.food.recipes\soups\clam-chowder</tt>
<br><tt> </tt>[ ... only two also have "manhattan" in the contents.
]
<br>    [ Note: "+" and "-" are canonical, but "AND", "OR"
and "NOT" may be used. ]</blockquote>
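One way to read the canonical "+" and "-" notation is as required and prohibited clauses. The sketch below evaluates such a query against the set of terms in one document. The class and method names (ClauseSketch, matches) are hypothetical, and the matching rules it encodes are an assumption made for illustration: every "+" clause must be present, no "-" clause may be present, and when there are no "+" clauses at least one unprefixed clause must be present. It handles bare terms only, with no fields or phrases.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of required (+) and prohibited (-) clauses; not Lucene code.
class ClauseSketch {
    /** Returns true if the document's terms satisfy the whitespace-separated clauses. */
    static boolean matches(String query, Set<String> docTerms) {
        boolean anyRequired = false;
        boolean anyOptionalHit = false;
        for (String clause : query.trim().split("\\s+")) {
            if (clause.isEmpty()) {
                continue;
            }
            if (clause.startsWith("+")) {
                anyRequired = true;
                if (!docTerms.contains(clause.substring(1))) {
                    return false; // a required term is missing
                }
            } else if (clause.startsWith("-")) {
                if (docTerms.contains(clause.substring(1))) {
                    return false; // a prohibited term is present
                }
            } else if (docTerms.contains(clause)) {
                anyOptionalHit = true;
            }
        }
        // Assumed rule: a query made up only of prohibited clauses matches nothing.
        return anyRequired || anyOptionalHit;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<String>(Arrays.asList("manhattan", "clam", "chowder"));
        System.out.println(matches("+clam +manhattan", doc)); // both required terms present
        System.out.println(matches("clam -manhattan", doc));  // prohibited term present
    }
}
```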
The <a href="../demo/IndexHTML.java">IndexHTML</a> demo is more sophisticated.
It incrementally maintains an index of HTML files, adding new files as
they appear, deleting old files as they disappear and re-indexing files
as they change.
<blockquote><tt>F:\><b>java demo.IndexHTML -create java\jdk1.1.6\docs\relnotes</b></tt>
<br><tt>adding java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt>
<br><tt> </tt>[ ... create an index containing all the relnotes ]
<p><tt>F:\><b>del java\jdk1.1.6\docs\relnotes\smicopyright.html</b></tt>
<p><tt>F:\><b>java demo.IndexHTML java\jdk1.1.6\docs\relnotes</b></tt>
<br><tt>deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt></blockquote>
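The bookkeeping behind such an incremental update can be sketched as a comparison between what the index recorded and what is on disk. The text does not say how IndexHTML detects changes, so the sketch assumes each file is identified by its path and a last-modified timestamp; the names (IncrementalSketch, plan) are hypothetical, and re-indexing a changed file is modelled as a delete followed by an add.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of incremental index maintenance; not the IndexHTML source.
class IncrementalSketch {
    /**
     * Compares path -> last-modified maps and lists the actions needed to bring
     * the index up to date: deletions first, then additions, each in path order.
     */
    static List<String> plan(Map<String, Long> indexed, Map<String, Long> onDisk) {
        List<String> actions = new ArrayList<String>();
        // Files that disappeared or changed must leave the index.
        for (Map.Entry<String, Long> entry : new TreeMap<String, Long>(indexed).entrySet()) {
            Long current = onDisk.get(entry.getKey());
            if (current == null || !current.equals(entry.getValue())) {
                actions.add("deleting " + entry.getKey());
            }
        }
        // Files that are new or changed must be (re-)added.
        for (Map.Entry<String, Long> entry : new TreeMap<String, Long>(onDisk).entrySet()) {
            Long previous = indexed.get(entry.getKey());
            if (previous == null || !previous.equals(entry.getValue())) {
                actions.add("adding " + entry.getKey());
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        Map<String, Long> indexed = new HashMap<String, Long>();
        indexed.put("a.html", 1L);
        indexed.put("b.html", 1L);
        indexed.put("c.html", 1L);
        Map<String, Long> onDisk = new HashMap<String, Long>();
        onDisk.put("b.html", 1L); // unchanged
        onDisk.put("c.html", 2L); // modified since it was indexed
        onDisk.put("d.html", 1L); // new file
        for (String action : plan(indexed, onDisk)) {
            System.out.println(action);
        }
    }
}
```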
HTML indexes are searched using Sun's <a href="http://jserv.javasoft.com/products/webserver/index.html">JavaWebServer</a>
(JWS) and <a href="../demo/Search.jhtml">Search.jhtml</a>. To use
this:
<ul>
<li>
copy <tt>Search.html</tt> and <tt>Search.jhtml</tt> to JWS's <tt>public_html</tt>
directory;</li>

<li>
copy lucene.jar to JWS's lib directory;</li>

<li>
create and maintain your indexes with demo.IndexHTML in JWS's top-level
directory;</li>

<li>
launch JWS, with the <tt>demo</tt> directory on CLASSPATH (only one class
is actually needed);</li>

<li>
visit <a href="../demo/Search.html">Search.html</a>.</li>
</ul>
Note that indexes can be updated while searches are going on. <tt>Search.jhtml</tt>
will re-open the index when it is updated so that the latest version is
immediately available.
<br>
</body>
</html>
@ -0,0 +1,128 @@
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Author" CONTENT="Doug Cutting">
<META NAME="GENERATOR" CONTENT="Mozilla/4.04 [en] (Win95; U) [Netscape]">
<TITLE>Lucene: a full-text search engine in Java</TITLE>
</HEAD>
<BODY>

<H1>
Lucene</H1>
Lucene is a full-text search engine written in Java. It is efficient,
providing high-performance indexing and searching using few system resources.
State-of-the-art search algorithms produce highest-quality search results.
The use of Java allows easy integration with cross-platform applications.
<H2>
Potential Applications</H2>

<UL>
<LI>
<B>Searchable E-Mail</B></LI>

<BR>Search large e-mail archives instantly; update index as new messages
arrive.
<LI>
<B>CD-ROM-based Online Documentation Search</B></LI>

<BR>Search large publications quickly with platform-independent system.
<LI>
<B>Search Previously-Visited Web Pages</B></LI>

<BR>Relocate a page seen weeks or months ago.
<LI>
<B>Web Site Searching</B></LI>

<BR>Let users search all the pages on your website.</UL>

<H2>
Features</H2>

<UL>
<LI>
<B>Scalable, High-Performance Indexing</B></LI>

<DL>
<DL>
<LI>
over 200MB/hour on Pentium II/266</LI>

<LI>
incremental indexing as fast as batch indexing</LI>

<LI>
small RAM requirements -- only 1MB heap</LI>

<LI>
index size roughly 30% the size of text indexed</LI>
</DL>
</DL>

<LI>
<B>Powerful, Accurate and Efficient Search Algorithms</B></LI>

<DL>
<DL>
<LI>
ranked searching -- best results returned first</LI>

<LI>
boolean and phrase queries</LI>

<LI>
fielded searching (e.g., title, author, contents)</LI>

<LI>
date-range searching</LI>

<LI>
<B><I>coming soon:</I></B></LI>

<DL>
<DL>
<LI>
<I>multiple-index searching with merged results</I></LI>

<LI>
<I>distributed searching over a network</I></LI>
</DL>
</DL>
</DL>
</DL>

<LI>
<B>Simple APIs allow developers to:</B></LI>

<DL>
<DL>
<LI>
incorporate new document types</LI>

<LI>
localize for new languages (already handles most European languages)</LI>

<LI>
develop new user interfaces</LI>
</DL>
</DL>

<LI>
<B>Cross-Platform Solution</B></LI>

<DL>
<DL>
<LI>
100%-pure Java <I>(not yet certified)</I></LI>
</DL>
</DL>
</UL>

<H2>
Contact</H2>

<UL><B>Douglass R. Cutting</B>
<BR>Email: cutting@lucene.com
<BR>Phone: 1 (510) 595-0232</UL>

</BODY>
</HTML>