Improved documentation.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149830 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
Doug Cutting 2002-08-05 17:39:03 +00:00
parent efb2e12705
commit 91512d5dde
2 changed files with 28 additions and 9 deletions

View File

@ -58,6 +58,21 @@ $Id$
for longer fields. Once the index is re-created, scores will be
as before. (cutting)
13. Added new method Token.setPositionIncrement().
This permits, for the purpose of phrase searching, placing
multiple terms in a single position. This is useful with
stemmers that produce multiple possible stems for a word.
This also permits the introduction of gaps between terms, so that
terms which are adjacent in a token stream will not be matched by
and exact phrase query. This makes it possible, e.g., to build
an analyzer where phrases are not matched over stop words which
have been removed.
Finally, repeating a token with an increment of zero can also be
used to boost scores of matches on that token.
1.2 RC6

View File

@ -54,6 +54,8 @@ package org.apache.lucene.analysis;
* <http://www.apache.org/>.
*/
import org.apache.lucene.index.TermPositions;
/** A Token is an occurence of a term from the text of a field. It consists of
a term's text, the start and end offset of the term in the text of the field,
and a type string.
@ -98,19 +100,21 @@ public final class Token {
*
* <p>The default value is one.
*
* <p>Two common uses for this are:<ul>
* <p>Some common uses for this are:<ul>
*
* <li>Set it to zero to put multiple terms in the same position. This is
* useful if, e.g., when a word has multiple stems. This way searches for
* phrases including either stem will match this occurence. In this case,
* all but the first stem's increment should be set to zero: the increment of
* the first instance should be one.
* useful if, e.g., a word has multiple stems. Searches for phrases
* including either stem will match. In this case, all but the first stem's
* increment should be set to zero: the increment of the first instance
* should be one. Repeating a token with an increment of zero can also be
* used to boost the scores of matches on that token.
*
* <li>Set it to values greater than one to inhibit exact phrase matches.
* If, for example, one does not want phrases to match across stop words,
* then one could build a stop word filter that removes stop words and also
* sets the increment to the number of stop words removed before each
* non-stop word.
* If, for example, one does not want phrases to match across removed stop
* words, then one could build a stop word filter that removes stop words and
* also sets the increment to the number of stop words removed before each
* non-stop word. Then exact phrase queries will only match when the terms
* occur with no intervening stop words.
*
* </ul>
* @see TermPositions