Getting Started tutorial added by Andrew C. Oliver.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149651 13f79535-47bb-0310-9956-ffa450edef68
Peter Carlson 2002-01-26 16:38:28 +00:00
parent fe92305c3c
commit c5730b429b
5 changed files with 1302 additions and 0 deletions

224
docs/demo.html Normal file

@ -0,0 +1,224 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<meta name="author" value="Andrew C. Oliver">
<meta name="email" value="acoliver@apache.org">
<title>Jakarta Lucene - Jakarta Lucene - Building and Installing the Basic Demo</title>
</head>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<tr>
<td align="left">
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
</td>
<td align="right">
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
</td>
</tr>
</table>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr>
<!-- LEFT SIDE NAVIGATION -->
<td width="20%" valign="top" nowrap="true">
<p><strong>About</strong></p>
<ul>
<li> <a href="./index.html">Overview</a>
</li>
<li> <a href="./powered.html">Powered by Lucene</a>
</li>
<li> <a href="./whoweare.html">Who We Are</a>
</li>
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
</li>
</ul>
<p><strong>Resources</strong></p>
<ul>
<li> <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
</li>
<li> <a href="./gettingstarted.html">Getting Started</a>
</li>
<li> <a href="http://www.jguru.com/faq/Lucene">JGuru FAQ</a>
</li>
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
</li>
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
</li>
<li> <a href="./resources.html">Articles</a>
</li>
<li> <a href="./api/index.html">Javadoc</a>
</li>
<li> <a href="./contributions.html">Contributions</a>
</li>
</ul>
<p><strong>Download</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
</li>
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
</li>
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
</li>
</ul>
<p><strong>Jakarta</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
</li>
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
</li>
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
</li>
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
</li>
</ul>
</td>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About this Document"><strong>About this Document</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
This document is intended as a "getting started" guide to using and running the
Jakarta Lucene demos. It walks you through some basic installation and configuration.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the Demos"><strong>About the Demos</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
The Lucene Demo code is a set of command line example applications that demonstrate various
functionality of Lucene and how one should go about adding it to their
applications.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Setting your classpath"><strong>Setting your classpath</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
First, extract the latest Lucene distribution.
</p>
<p>
You should see the Jakarta Lucene jar file in the directory you created
when you extracted the archive. It should be named something like
<b>lucene-{version}.jar</b>.
</p>
<p>
You should also see a file called <b>lucene-demos-{version}.jar</b>.
Put both of these files in your Java CLASSPATH.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Indexing Files"><strong>Indexing Files</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Once you've gotten this far you're probably itching to go. Let's <b> build an index!</b>
Assuming you've set your classpath correctly, just type
"java org.apache.lucene.demo.IndexFiles {full-path-to-lucene}/src". This will produce
a subdirectory called "index" which will contain an index of all of the Lucene
source code.
</p>
<p>
<b> To search the index </b> type "java org.apache.lucene.demo.SearchFiles". You'll be prompted
for a query. Type in a swear word and press the enter key. You'll see that the Lucene
developers are very well mannered: you get no results. Now try entering the word "vector".
That should return a whole bunch of documents. The results will page at every tenth
result and ask you whether you want more results.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the code..."><strong>About the code...</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
<a href="demo2.html">read on&gt;&gt;&gt;</a>
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
</td>
</tr>
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
</em></font></div>
</td></tr>
</table>
</body>
</html>
<!-- end the processing -->

251
docs/demo2.html Normal file

@ -0,0 +1,251 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<meta name="author" value="Andrew C. Oliver">
<meta name="email" value="acoliver@apache.org">
<title>Jakarta Lucene - Jakarta Lucene - Basic Demo Sources Walkthrough</title>
</head>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<tr>
<td align="left">
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
</td>
<td align="right">
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
</td>
</tr>
</table>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr>
<!-- LEFT SIDE NAVIGATION -->
<td width="20%" valign="top" nowrap="true">
<p><strong>About</strong></p>
<ul>
<li> <a href="./index.html">Overview</a>
</li>
<li> <a href="./powered.html">Powered by Lucene</a>
</li>
<li> <a href="./whoweare.html">Who We Are</a>
</li>
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
</li>
</ul>
<p><strong>Resources</strong></p>
<ul>
<li> <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
</li>
<li> <a href="./gettingstarted.html">Getting Started</a>
</li>
<li> <a href="http://www.jguru.com/faq/Lucene">JGuru FAQ</a>
</li>
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
</li>
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
</li>
<li> <a href="./resources.html">Articles</a>
</li>
<li> <a href="./api/index.html">Javadoc</a>
</li>
<li> <a href="./contributions.html">Contributions</a>
</li>
</ul>
<p><strong>Download</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
</li>
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
</li>
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
</li>
</ul>
<p><strong>Jakarta</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
</li>
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
</li>
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
</li>
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
</li>
</ul>
</td>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the Code"><strong>About the Code</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In this section we walk through the sources behind the basic Lucene demo such as where to
find it, its parts and their function. This section is intended for Java developers
wishing to understand how to use Jakarta Lucene in their applications.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Location of the source"><strong>Location of the source</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Relative to the directory created when you extracted Lucene or retrieved it from CVS, you
should see a directory called "src" which in turn contains a directory called "demo".
This is the root for all of the Lucene demos. Under this directory is org/apache/lucene/demo,
which is where all the Java sources live.
</p>
<p>
Within this directory you should see the IndexFiles class we executed earlier. Bring that
up in vi or your alternative text editor and let's take a look at it.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="IndexFiles"><strong>IndexFiles</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
As we discussed in the previous walkthrough, the IndexFiles class creates a Lucene Index.
Let's take a look at how it does this.
</p>
<p>
The first substantial thing the main function does is create an instance
of IndexWriter. It passes a string called "index" and a new instance of a class called
"StandardAnalyzer". The "index" string is the name of the directory that all index information
should be stored in. Because we're not passing any path information, one must assume this
will be created as a subdirectory of the current directory (if it does not already exist). On
some platforms this may actually result in it being created in other directories (such as
the user's home directory).
</p>
<p>
The <b>IndexWriter</b> is the main class responsible for creating indices. To use it you
must instantiate it with a path that it can write the index into. If this path does not
exist it will create it, otherwise it will refresh the index living at that path. You
must also pass an instance of <b>org.apache.lucene.analysis.Analyzer</b>.
</p>
<p>
The <b>Analyzer</b>, in this case the <b>StandardAnalyzer</b>, is little more than a standard Java
Tokenizer, converting all strings to lowercase and filtering out useless words from the index.
By useless words I mean common language words such as articles (a, an, the) and other words that
would be useless for searching. It should be noted that there are different rules for every
language, and you should use the proper analyzer for each. Lucene currently provides Analyzers
for English and German.
</p>
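<p>
To make the idea concrete, here is a minimal sketch of that kind of analysis (lowercasing,
splitting into words and dropping stop words) written in plain Java. It does not use the
Lucene classes at all; the class name StopWordSketch and its three-word stop list are
assumptions made up for this illustration, and a real analyzer does considerably more.
</p>

```java
import java.util.Arrays;
import java.util.Locale;

public class StopWordSketch {
    // Hypothetical stop list for illustration only; not Lucene's actual list.
    private static final String[] STOP_WORDS = {"a", "an", "the"};

    public static boolean isStopWord(String token) {
        return Arrays.asList(STOP_WORDS).contains(token);
    }

    // Lowercases the text, splits it on anything that is not a letter or
    // digit, drops stop words and returns the remaining tokens joined by
    // single spaces.
    public static String analyze(String text) {
        StringBuilder kept = new StringBuilder();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || isStopWord(token)) {
                continue;
            }
            if (kept.length() != 0) {
                kept.append(' ');
            }
            kept.append(token);
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // prints "quick fox and index"
        System.out.println(analyze("The Quick fox and an Index"));
    }
}
```

<p>
The same normalization has to be applied to queries as to the indexed text, otherwise a
search for "The Index" would never match a document that was stored as "index".
</p>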
<p>
Looking down further in the file, you should see the indexDocs() code. This recursive function
simply crawls the directories and uses FileDocument to create Document objects. The Document
is simply a data object to represent the content in the file as well as its creation time and
location. These instances are added to the IndexWriter. Take a look inside FileDocument. It's
not particularly complicated; it just adds fields to the Document.
</p>
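<p>
The recursive crawl is easy to picture without any Lucene code. The sketch below uses only
java.io.File and prints what it would index instead of building Document objects. The class
name CrawlSketch and its crawl() method are hypothetical names for this illustration; they
are not the demo's own indexDocs() source.
</p>

```java
import java.io.File;

public class CrawlSketch {
    // Recursively visits everything under the given path and returns the
    // number of regular files seen.
    public static int crawl(File file) {
        if (file.isDirectory()) {
            int count = 0;
            String[] names = file.list();
            if (names != null) { // list() returns null if the directory cannot be read
                for (String name : names) {
                    count += crawl(new File(file, name));
                }
            }
            return count;
        }
        if (!file.isFile()) {
            return 0; // missing path or special file: nothing to index
        }
        // A real indexer would turn the file into a document at this point.
        System.out.println("would index " + file.getPath()
                + " (last modified " + file.lastModified() + ")");
        return 1;
    }

    public static void main(String[] args) {
        File root = new File(args.length == 0 ? "." : args[0]);
        System.out.println(crawl(root) + " file(s) visited");
    }
}
```

<p>
Run it with a directory name as its only argument, or with no argument to crawl the current
directory.
</p>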
<p>
As you can see there isn't much to creating an index. The devil is in the details. You may also
wish to examine the other samples in this directory, particularly the IndexHTML class. It is
a bit more complex but builds upon this example.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Searching Files"><strong>Searching Files</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
The SearchFiles class is quite simple. It primarily collaborates with an IndexSearcher, StandardAnalyzer
(which is used in the IndexFiles class as well) and a QueryParser. The query parser is constructed
with an analyzer used to interpret your query in the same way the index was interpreted: finding
the end of words and removing useless words like 'a', 'an' and 'the'. The Query object contains the
results from the QueryParser and is passed to the searcher. The searcher results are returned in
a collection of Documents called "Hits" which is then iterated through and displayed to the user.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="The Web example..."><strong>The Web example...</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
<a href="demo3.html">read on&gt;&gt;&gt;</a>
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
</td>
</tr>
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
</em></font></div>
</td></tr>
</table>
</body>
</html>
<!-- end the processing -->

264
docs/demo3.html Normal file

@ -0,0 +1,264 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<meta name="author" value="Andrew C. Oliver">
<meta name="email" value="acoliver@apache.org">
<title>Jakarta Lucene - Jakarta Lucene - Building and Installing the Basic Demo</title>
</head>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<tr>
<td align="left">
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
</td>
<td align="right">
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
</td>
</tr>
</table>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr>
<!-- LEFT SIDE NAVIGATION -->
<td width="20%" valign="top" nowrap="true">
<p><strong>About</strong></p>
<ul>
<li> <a href="./index.html">Overview</a>
</li>
<li> <a href="./powered.html">Powered by Lucene</a>
</li>
<li> <a href="./whoweare.html">Who We Are</a>
</li>
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
</li>
</ul>
<p><strong>Resources</strong></p>
<ul>
<li> <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
</li>
<li> <a href="./gettingstarted.html">Getting Started</a>
</li>
<li> <a href="http://www.jguru.com/faq/Lucene">JGuru FAQ</a>
</li>
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
</li>
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
</li>
<li> <a href="./resources.html">Articles</a>
</li>
<li> <a href="./api/index.html">Javadoc</a>
</li>
<li> <a href="./contributions.html">Contributions</a>
</li>
</ul>
<p><strong>Download</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
</li>
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
</li>
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
</li>
</ul>
<p><strong>Jakarta</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
</li>
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
</li>
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
</li>
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
</li>
</ul>
</td>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About this Document"><strong>About this Document</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
This document is intended as a "getting started" guide to installing and running the
Jakarta Lucene web application demo. This guide assumes that you have read the
information in the previous two examples or already know it anyhow. We'll use
Tomcat 4.0.1 as our reference web container. These demos should work with nearly
any container, but it is up to you to adapt them appropriately.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the Demos"><strong>About the Demos</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
The Lucene Web Application demo is a template web application intended for deployment
on Tomcat or a similar web container. It's NOT designed as a "best practices"
implementation by ANY means. It's more of a "hello world" type Lucene Web App.
The purpose of this application is to demonstrate Lucene. With that being said,
it should be relatively simple to create a small searchable website in Tomcat or
a similar application server.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Indexing Files"><strong>Indexing Files</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Once you've gotten this far you're probably itching to go.
Let's start by creating the index you'll need for the web examples.
Since you've already set your classpath in the previous examples,
all you need to do is type
<b> "java org.apache.lucene.demo.IndexHTML -create -index {index-dir} .."</b>.
You'll need to do this from your {tomcat}/webapps/luceneweb directory. {index-dir}
should be a directory that Tomcat has permission to read and write, but is
outside of a web accessible context. By default the webapp is configured
to look in <b>/opt/lucene/index</b> for this index.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Deploying the Demos"><strong>Deploying the Demos</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>Located in your distribution directory you should see
a war file called luceneweb.war. Copy this to your
{tomcat-home}/webapps directory. You may need to restart
Tomcat. </p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Configuration"><strong>Configuration</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
From your Tomcat directory look in the webapps/luceneweb subdirectory. If it's not
present, try browsing to "http://localhost:8080/luceneweb" then look again.
Edit a file called configuration.jsp. Ensure that the indexLocation is equal to the
location you used for your index. You may also customize the appTitle and appFooter
strings as you see fit. Once you have finished altering the configuration you should
restart Tomcat. You may also wish to update the war file by typing
<b>jar -uf luceneweb.war configuration.jsp</b> from the luceneweb subdirectory.
(The u option is not available in all versions of jar. In that case, recreate the war file.)
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Running the Demos"><strong>Running the Demos</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>Now you're ready to roll. In your browser set the url to "http://localhost:8080/luceneweb",
enter "test" and the number of items per page, and press search.</p>
<p>You should now be looking either at a number of results (provided you didn't erase the
Tomcat examples) or nothing. Try other search terms. Depending on the number of items
per page you set and results returned, there may be a link at the bottom that says "more results&gt;&gt;";
clicking it goes to subsequent pages. If you get an error regarding opening the index, then you
probably set the path in "configuration" incorrectly or Tomcat doesn't have permissions to the
index (or you skipped the step of creating it).</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the code..."><strong>About the code...</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
If you want to know more about how this web app works or how to customize it then
<a href="demo4.html">read on&gt;&gt;&gt;</a>.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
</td>
</tr>
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
</em></font></div>
</td></tr>
</table>
</body>
</html>
<!-- end the processing -->

323
docs/demo4.html Normal file

@ -0,0 +1,323 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<meta name="author" value="Andrew C. Oliver">
<meta name="email" value="acoliver@apache.org">
<title>Jakarta Lucene - Jakarta Lucene - Basic Demo Sources Walkthrough</title>
</head>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<tr>
<td align="left">
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
</td>
<td align="right">
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
</td>
</tr>
</table>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr>
<!-- LEFT SIDE NAVIGATION -->
<td width="20%" valign="top" nowrap="true">
<p><strong>About</strong></p>
<ul>
<li> <a href="./index.html">Overview</a>
</li>
<li> <a href="./powered.html">Powered by Lucene</a>
</li>
<li> <a href="./whoweare.html">Who We Are</a>
</li>
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
</li>
</ul>
<p><strong>Resources</strong></p>
<ul>
<li> <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
</li>
<li> <a href="./gettingstarted.html">Getting Started</a>
</li>
<li> <a href="http://www.jguru.com/faq/Lucene">JGuru FAQ</a>
</li>
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
</li>
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
</li>
<li> <a href="./resources.html">Articles</a>
</li>
<li> <a href="./api/index.html">Javadoc</a>
</li>
<li> <a href="./contributions.html">Contributions</a>
</li>
</ul>
<p><strong>Download</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
</li>
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
</li>
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
</li>
</ul>
<p><strong>Jakarta</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
</li>
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
</li>
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
</li>
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
</li>
</ul>
</td>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About the Code"><strong>About the Code</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In this section we walk through the sources behind the basic Lucene Web Application demo:
where to find it, its parts, and their function. This section is intended for Java developers
wishing to understand how to use Jakarta Lucene in their applications or for those involved
in deploying web applications based on Lucene.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Location of the source (developers/deployers)"><strong>Location of the source (developers/deployers)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Relative to the directory created when you extracted Lucene or retrieved it from CVS, you
should see a directory called "src" which in turn contains a directory called "jsp".
This is the root for all of the Lucene web demo.
</p>
<p>
Within this directory you should see the index.jsp file. Bring this up in vi or your
editor of choice.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="index.jsp (developers/deployers)"><strong>index.jsp (developers/deployers)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
This jsp page is pretty boring by itself. All it does is include a header, display a form and
include a footer. If you look at the form, it has two fields: query (where you enter your
search criteria) and maxresults (where you specify the number of results per page). If you look
at the form tag, you'll notice it uses the get method as opposed to post. While this is
considered deprecated functionality by the latest w3c specs, it's unlikely to go away due to the
usefulness of being able to bookmark things like searches. By the structure of this JSP it should
be easy to customize it without even editing this particular file. You could simply change the
header and footer. Let's look at the header.jsp (located in the same directory) next.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="header.jsp (developers/deployers)"><strong>header.jsp (developers/deployers)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
The header is also very simple by itself. The only thing it does is include the configuration.jsp
(which you looked at in the last section of this guide) and set the title and a brief header. This
would be a good place to put your own custom HTML to "pretty" things up a bit. We won't cover the
footer because all it does is display the footer and close your tags. Let's look at the results.jsp,
the meat of this application, next.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="results.jsp (developers)"><strong>results.jsp (developers)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
The results.jsp has a lot more functionality. Much of it is for paging the search results. We'll not
cover this as it's commented well enough. It does not perform any optimizations such as caching results,
etc. as that would make this a more complex example. The first thing in this page is the actual imports
for the Lucene classes and Lucene demo classes. These classes are loaded from the jars included in the
WEB-INF/lib directory in the final war file.
</p>
<p>
You'll notice that this file includes the same header and footer as the "index.jsp". From there the jsp
constructs an IndexSearcher with the "indexLocation" that was specified in the "configuration.jsp". If there
is an error of any kind in opening the index, it is displayed to the user and a boolean flag is set to tell
the remaining sections of the jsp not to continue.
</p>
<p>
From there, this jsp attempts to get the search criteria, the start index (used for paging) and the maximum
number of results per page. If the maximum results per page is not set or not valid then it and the
start index are set to default values. If only the start index is invalid it is set to a default value. If
the criteria isn't provided then a servlet error is thrown (it is assumed that this is the result of url tampering
or some form of browser malfunction).
</p>
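<p>
The defaulting rules above can be sketched in plain Java roughly as follows. This is not the code
from results.jsp: the class PagingParams, the default of 50 results per page and the use of
IllegalArgumentException (standing in for the servlet error) are all assumptions chosen so the
example runs without a servlet container.
</p>

```java
// Hypothetical holder for the two paging values; the names are illustrative.
final class PagingParams {
    static final int DEFAULT_START = 0;
    static final int DEFAULT_MAX_RESULTS = 50; // assumed default, not taken from the demo

    final int startIndex;
    final int maxResults;

    private PagingParams(int startIndex, int maxResults) {
        this.startIndex = startIndex;
        this.maxResults = maxResults;
    }

    // Applies the rules described above to the raw request parameter strings.
    static PagingParams parse(String criteria, String startParam, String maxParam) {
        if (criteria == null) {
            // The jsp throws a servlet error here; this sketch uses a plain runtime exception.
            throw new IllegalArgumentException("no query specified");
        }
        int max = parseOr(maxParam, -1);
        if (max < 1) {
            // Missing or invalid page size: reset both values to their defaults.
            return new PagingParams(DEFAULT_START, DEFAULT_MAX_RESULTS);
        }
        int start = parseOr(startParam, -1);
        // Only the start index is invalid: reset just that one.
        return new PagingParams(start < 0 ? DEFAULT_START : start, max);
    }

    // Returns the parsed integer, or the fallback when the value is absent or not a number.
    private static int parseOr(String value, int fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```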
<p>
The jsp moves on to construct a StandardAnalyzer, just as in the simple demo, to analyze the search criteria. It
is passed to the QueryParser along with the criteria to construct a Query object. You'll also notice the
string literal "contents" included. This specifies that the search should cover the contents field and not
the title, url or some other field in the indexed documents. If there is any error in constructing a Query
object, an error is displayed to the user.
</p>
<p>
In the next section of the jsp the IndexSearcher is asked to search given the query object. The results are
returned in a collection called "hits". If the length property of the hits collection is 0 then an error
is displayed to the user and the error flag is set.
</p>
<p>
Finally the jsp iterates through the hits collection and displays properties of the "Document" objects we talked
about in the first walkthrough. These objects contain "known" fields specific to their indexer (in this case
"IndexHTML" constructs a document with "url", "title" and "contents"). You'll notice that these results are paged
but the search is repeated every time. This is an area where optimization could improve performance for large
result sets.
</p>
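<p>
One possible shape for that optimization is a small cache that keeps the full result list per
query and hands out one page at a time. The sketch below is an assumption about how you might do
it, not something the template application contains: ResultCache is a hypothetical class, the
results are plain strings rather than Lucene documents, and the search itself is passed in as a
function so the example runs on its own.
</p>

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical least-recently-used cache of search results, keyed by the query string.
final class ResultCache {
    private final Map<String, List<String>> cache;
    int searchesRun = 0; // counts how often the real search had to be executed

    ResultCache(final int capacity) {
        // Access-ordered LinkedHashMap: the least recently used query is evicted first.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns one page of results, running the search only if the query is not cached yet.
    synchronized List<String> page(String query, int start, int max,
                                   Function<String, List<String>> search) {
        List<String> all = cache.get(query);
        if (all == null) {
            all = search.apply(query);
            searchesRun++;
            cache.put(query, all);
        }
        // Clamp the requested window to the bounds of the result list.
        int from = Math.min(Math.max(start, 0), all.size());
        int to = (int) Math.min((long) from + Math.max(max, 0), all.size());
        return all.subList(from, to);
    }
}
```

<p>
With something like this, moving from page one to page two of the same query would reuse the stored
list instead of searching again. A real deployment would also need to decide when cached results
become stale, for example after the index is rebuilt.
</p>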
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="More sources (developers)"><strong>More sources (developers)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
There are additional sources used by the web app that were not specifically covered by either walkthrough, for
example the HTML parser, the IndexHTML class and the HTMLDocument class. These are very similar to the classes
covered in the first example; however, they have properties specific to parsing and indexing HTML. This is
beyond our scope; however, by now you should feel like you're "getting started" with Lucene.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Where to go from here? (Everyone!)"><strong>Where to go from here? (Everyone!)</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
There are a number of things this demo doesn't do or doesn't do quite right. For instance, you may
have noticed that documents in the root context are unreachable (unless you reconfigure Tomcat to
support that context or redirect to it). Anywhere the directory doesn't quite match the context mapping,
you'll have a broken link in your results. Indexing non-local files, or meeting other needs of that
kind, isn't supported, and there may be security issues with running the indexing application from
your webapps directory. There are a number of things left for you, the implementor or developer, to do.
</p>
<p>
In time some of these things may be added to Lucene as features (if you've got a good idea we'd love to hear it!),
but for now: this is where you begin and the search engine/indexer ends. Lastly, one would assume you'd
want to follow the above advice and customize the application to look a little more fancy than black on
white with "Lucene Template" at the top. We'll see you on the Lucene Users' or Developers' mailing lists!
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="When to contact the Author"><strong>When to contact the Author</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Please resist the urge to contact the authors of this document (without bribes of fame and fortune attached). First
contact the <a href="http://jakarta.apache.org/site/mail.html">mailing lists</a>. That being said, feedback
on and modifications to this document and samples are ever so greatly appreciated. They are just best sent to the
lists so that everyone can share in them. Certainly you'll get the most help there as well.
Thanks for understanding.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
</td>
</tr>
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
</em></font></div>
</td></tr>
</table>
</body>
</html>
<!-- end the processing -->

240
docs/gettingstarted.html Normal file

@ -0,0 +1,240 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<meta name="author" value="Andrew C. Oliver">
<meta name="email" value="acoliver@apache.org">
<title>Jakarta Lucene - Jakarta Lucene - Getting Started Guide</title>
</head>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<tr>
<td align="left">
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
</td>
<td align="right">
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
</td>
</tr>
</table>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr>
<!-- LEFT SIDE NAVIGATION -->
<td width="20%" valign="top" nowrap="true">
<p><strong>About</strong></p>
<ul>
<li> <a href="./index.html">Overview</a>
</li>
<li> <a href="./powered.html">Powered by Lucene</a>
</li>
<li> <a href="./whoweare.html">Who We Are</a>
</li>
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
</li>
</ul>
<p><strong>Resources</strong></p>
<ul>
<li> <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
</li>
<li> <a href="./gettingstarted.html">Getting Started</a>
</li>
<li> <a href="http://www.jguru.com/faq/Lucene">JGuru FAQ</a>
</li>
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
</li>
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
</li>
<li> <a href="./resources.html">Articles</a>
</li>
<li> <a href="./api/index.html">Javadoc</a>
</li>
<li> <a href="./contributions.html">Contributions</a>
</li>
</ul>
<p><strong>Download</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
</li>
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
</li>
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
</li>
</ul>
<p><strong>Jakarta</strong></p>
<ul>
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
</li>
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
</li>
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
</li>
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
</li>
</ul>
</td>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About this Document"><strong>About this Document</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
This document is intended as a "getting started" guide. It has three basic
audiences: novices looking to install Jakarta Lucene on their application or
web server, developers looking to modify or base the applications they develop
on Lucene, and developers looking to become involved in and contribute to the
development of Lucene. This document is written in tutorial and walkthrough
format. It intends to help you in "getting started", but does not go into great
depth on the conceptual or inner details of Jakarta Lucene.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Format of this Guide"><strong>Format of this Guide</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
Each section listed below builds on the previous one. That being said, more advanced users may
wish to skip sections.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="The Simple Demo"><strong>The Simple Demo</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In <a href="demo.html">this</a> section we walk through building and running the basic Lucene demo.
This section is intended for anyone who wants a basic background on using the provided Lucene demos.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Simple Demo Source Walkthrough"><strong>Simple Demo Source Walkthrough</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In <a href="demo2.html">this</a> section we walk through the sources and implementation
for the basic Lucene demo. This section is intended for developers.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Template Web Application"><strong>Template Web Application</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In <a href="demo3.html">this</a> section we walk through installing
and configuring the template web application. While this walkthrough assumes
Tomcat 4.0.x as your container of choice, there is no reason you can't (provided you have
the requisite knowledge) adapt the instructions to your container. This section is intended
for those responsible for the development or deployment of Lucene-based web applications.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Template Web Application sources"><strong>Template Web Application sources</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>
In <a href="demo4.html">this</a> section we walk through the sources used to construct the
template web application. Please note the template application is designed to highlight
features of Lucene and is <b>not</b> an example of best practices. (One would hopefully
use MVC architecture such as provided by Jakarta Struts and taglibs, or better yet XML
with stylesheets, but showing you how to do that would be WAY beyond the scope of this
demonstration. Additionally, one could cache results and perform other performance
optimizations, but those are beyond the scope of this demo.)
</p>
<p>
This section is intended for developers and those wishing to customize the template web
application to their needs. The sections useful to developers only are clearly delineated.
</p>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table>
</td>
</tr>
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
</td></tr>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
</em></font></div>
</td></tr>
</table>
</body>
</html>
<!-- end the processing -->