mirror of https://github.com/apache/lucene.git
mostly spelling fixes, some small clarifications
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@150623 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
3c4f929368
commit
fc493d9b08
|
@ -188,7 +188,7 @@ The first substantial thing the main function does is instantiate an instance
|
|||
of IndexWriter. It passes a string called "index" and a new instance of a class called
|
||||
"StandardAnalyzer". The "index" string is the name of the directory that all index information
|
||||
should be stored in. Because we're not passing any path information, one must assume this
|
||||
will be created as a subdirectory of the current directory (if does not already exist). On
|
||||
will be created as a subdirectory of the current directory (if it does not already exist). On
|
||||
some platforms this may actually result in it being created in other directories (such as
|
||||
the user's home directory).
|
||||
</p>
|
||||
|
@ -199,18 +199,18 @@ exist it will create it, otherwise it will refresh the index living at that path
|
|||
must a also pass an instance of <b>org.apache.analysis.Analyzer</b>.
|
||||
</p>
|
||||
<p>
|
||||
The <b>Analyzer</b>, in this case, the <b>Stop Analyzer</b> is little more than a standard Java
|
||||
Tokenizer, converting all strings to lowercase and filtering out useless words from the index.
|
||||
By useless words I mean common language words such as articles (a,an,the) and other words that
|
||||
would be useless for searching. It should be noted that there are different rules for every
|
||||
language, and you should use the proper analyzer for each. Lucene currently provides Analyzers
|
||||
for English and German.
|
||||
The <b>Analyzer</b>, in this case, the <b>StandardAnalyzer</b> is little more than a standard Java
|
||||
Tokenizer, converting all strings to lowercase and filtering out useless words and characters from the index.
|
||||
By useless words and characters I mean common language words such as articles (a, an, the, etc.) and other
|
||||
strings that would be useless for searching (e.g. <b>'s</b>) . It should be noted that there are different
|
||||
rules for every language, and you should use the proper analyzer for each. Lucene currently
|
||||
provides Analyzers for English and German, more can be found in the Lucene Sandbox.
|
||||
</p>
|
||||
<p>
|
||||
Looking down further in the file, you should see the indexDocs() code. This recursive function
|
||||
simply crawls the directories and uses FileDocument to create Document objects. The Document
|
||||
is simply a data object to represent the content in the file as well as its creation time and
|
||||
location. These instances are added to the indexWriter. Take a look inside FileDocument. Its
|
||||
location. These instances are added to the indexWriter. Take a look inside FileDocument. It's
|
||||
not particularly complicated, it just adds fields to the Document.
|
||||
</p>
|
||||
<p>
|
||||
|
|
|
@ -224,7 +224,7 @@ the meat of this application next.
|
|||
<blockquote>
|
||||
<p>
|
||||
The results.jsp had a lot more functionality. Much of it is for paging the search results we'll not
|
||||
cover this as its commented well enough. It does not peform any optimizations such as caching results,
|
||||
cover this as it's commented well enough. It does not perform any optimizations such as caching results,
|
||||
etc. as that would make this a more complex example. The first thing in this page is the actual imports
|
||||
for the Lucene classes and Lucene demo classes. These classes are loaded from the jars included in the
|
||||
WEB-INF/lib directory in the final war file.
|
||||
|
@ -232,7 +232,7 @@ WEB-INF/lib directory in the final war file.
|
|||
<p>
|
||||
You'll notice that this file includes the same header and footer as the "index.jsp". From there the jsp
|
||||
constructs an IndexSearcher with the "indexLocation" that was specified in the "configuration.jsp". If there
|
||||
is an error of any kind in opening the index, it is diplayed ot the user and a boolean flag is set to tell
|
||||
is an error of any kind in opening the index, it is diplayed to the user and a boolean flag is set to tell
|
||||
the rest of the sections of the jsp not to continue.
|
||||
</p>
|
||||
<p>
|
||||
|
@ -245,12 +245,12 @@ or some form of browser malfunction).
|
|||
<p>
|
||||
The jsp moves on to construct a StandardAnalyzer just as in the simple demo, to analyze the search critieria, it
|
||||
is passed to the QueryParser along with the criteria to construct a Query object. You'll also notice the
|
||||
string literal "contents" included. This is to specify the search should include the the contents and not
|
||||
string literal "contents" included. This is to specify the search should include the contents and not
|
||||
the title, url or some other field in the indexed documents. If there is any error in constructing a Query
|
||||
object an error is displayed to the user.
|
||||
</p>
|
||||
<p>
|
||||
In the next section of the jsp the IndexSearcher is asked to search given the query object. the results are
|
||||
In the next section of the jsp the IndexSearcher is asked to search given the query object. The results are
|
||||
returned in a collection called "hits". If the length property of the hits collection is 0 then an error
|
||||
is displayed to the user and the error flag is set.
|
||||
</p>
|
||||
|
@ -323,7 +323,7 @@ white with "Lucene Template" at the top. We'll see you on the Lucene Users' or
|
|||
<p>
|
||||
Please resist the urge to contact the authors of this document (without bribes of fame and fortune attached). First
|
||||
contact the <a href="http://jakarta.apache.org/site/mail.html">mailing lists</a>. That being said feedback,
|
||||
and modifications to this document and samples are ever so greatly appreciatedThey are just best sent to the
|
||||
and modifications to this document and samples are ever so greatly appreciated. They are just best sent to the
|
||||
lists so that everyone can share in them. Certainly you'll get the most help there as well.
|
||||
Thanks for understanding.
|
||||
</p>
|
||||
|
|
|
@ -38,7 +38,7 @@ The first substantial thing the main function does is instantiate an instance
|
|||
of IndexWriter. It passes a string called "index" and a new instance of a class called
|
||||
"StandardAnalyzer". The "index" string is the name of the directory that all index information
|
||||
should be stored in. Because we're not passing any path information, one must assume this
|
||||
will be created as a subdirectory of the current directory (if does not already exist). On
|
||||
will be created as a subdirectory of the current directory (if it does not already exist). On
|
||||
some platforms this may actually result in it being created in other directories (such as
|
||||
the user's home directory).
|
||||
</p>
|
||||
|
@ -49,18 +49,18 @@ exist it will create it, otherwise it will refresh the index living at that path
|
|||
must a also pass an instance of <b>org.apache.analysis.Analyzer</b>.
|
||||
</p>
|
||||
<p>
|
||||
The <b>Analyzer</b>, in this case, the <b>Stop Analyzer</b> is little more than a standard Java
|
||||
Tokenizer, converting all strings to lowercase and filtering out useless words from the index.
|
||||
By useless words I mean common language words such as articles (a,an,the) and other words that
|
||||
would be useless for searching. It should be noted that there are different rules for every
|
||||
language, and you should use the proper analyzer for each. Lucene currently provides Analyzers
|
||||
for English and German.
|
||||
The <b>Analyzer</b>, in this case, the <b>StandardAnalyzer</b> is little more than a standard Java
|
||||
Tokenizer, converting all strings to lowercase and filtering out useless words and characters from the index.
|
||||
By useless words and characters I mean common language words such as articles (a, an, the, etc.) and other
|
||||
strings that would be useless for searching (e.g. <b>'s</b>) . It should be noted that there are different
|
||||
rules for every language, and you should use the proper analyzer for each. Lucene currently
|
||||
provides Analyzers for English and German, more can be found in the Lucene Sandbox.
|
||||
</p>
|
||||
<p>
|
||||
Looking down further in the file, you should see the indexDocs() code. This recursive function
|
||||
simply crawls the directories and uses FileDocument to create Document objects. The Document
|
||||
is simply a data object to represent the content in the file as well as its creation time and
|
||||
location. These instances are added to the indexWriter. Take a look inside FileDocument. Its
|
||||
location. These instances are added to the indexWriter. Take a look inside FileDocument. It's
|
||||
not particularly complicated, it just adds fields to the Document.
|
||||
</p>
|
||||
<p>
|
||||
|
|
|
@ -54,7 +54,7 @@ the meat of this application next.
|
|||
<section name="results.jsp (developers)">
|
||||
<p>
|
||||
The results.jsp had a lot more functionality. Much of it is for paging the search results we'll not
|
||||
cover this as its commented well enough. It does not peform any optimizations such as caching results,
|
||||
cover this as it's commented well enough. It does not perform any optimizations such as caching results,
|
||||
etc. as that would make this a more complex example. The first thing in this page is the actual imports
|
||||
for the Lucene classes and Lucene demo classes. These classes are loaded from the jars included in the
|
||||
WEB-INF/lib directory in the final war file.
|
||||
|
@ -62,7 +62,7 @@ WEB-INF/lib directory in the final war file.
|
|||
<p>
|
||||
You'll notice that this file includes the same header and footer as the "index.jsp". From there the jsp
|
||||
constructs an IndexSearcher with the "indexLocation" that was specified in the "configuration.jsp". If there
|
||||
is an error of any kind in opening the index, it is diplayed ot the user and a boolean flag is set to tell
|
||||
is an error of any kind in opening the index, it is diplayed to the user and a boolean flag is set to tell
|
||||
the rest of the sections of the jsp not to continue.
|
||||
</p>
|
||||
<p>
|
||||
|
@ -75,12 +75,12 @@ or some form of browser malfunction).
|
|||
<p>
|
||||
The jsp moves on to construct a StandardAnalyzer just as in the simple demo, to analyze the search critieria, it
|
||||
is passed to the QueryParser along with the criteria to construct a Query object. You'll also notice the
|
||||
string literal "contents" included. This is to specify the search should include the the contents and not
|
||||
string literal "contents" included. This is to specify the search should include the contents and not
|
||||
the title, url or some other field in the indexed documents. If there is any error in constructing a Query
|
||||
object an error is displayed to the user.
|
||||
</p>
|
||||
<p>
|
||||
In the next section of the jsp the IndexSearcher is asked to search given the query object. the results are
|
||||
In the next section of the jsp the IndexSearcher is asked to search given the query object. The results are
|
||||
returned in a collection called "hits". If the length property of the hits collection is 0 then an error
|
||||
is displayed to the user and the error flag is set.
|
||||
</p>
|
||||
|
@ -123,7 +123,7 @@ white with "Lucene Template" at the top. We'll see you on the Lucene Users' or
|
|||
<p>
|
||||
Please resist the urge to contact the authors of this document (without bribes of fame and fortune attached). First
|
||||
contact the <a href="http://jakarta.apache.org/site/mail.html">mailing lists</a>. That being said feedback,
|
||||
and modifications to this document and samples are ever so greatly appreciatedThey are just best sent to the
|
||||
and modifications to this document and samples are ever so greatly appreciated. They are just best sent to the
|
||||
lists so that everyone can share in them. Certainly you'll get the most help there as well.
|
||||
Thanks for understanding.
|
||||
</p>
|
||||
|
|
Loading…
Reference in New Issue