Improved documentation.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149830 13f79535-47bb-0310-9956-ffa450edef68
2002-08-05 17:39:03 +00:00 · 2002-08-05 17:39:03 +00:00 · 91512d5dde
parent efb2e12705
commit 91512d5dde
2 changed files with 28 additions and 9 deletions
--- a/CHANGES.txt
+++ b/CHANGES.txt
@ -58,6 +58,21 @@ $Id$
     for longer fields.  Once the index is re-created, scores will be
     as before. (cutting)

+ 13. Added new method Token.setPositionIncrement().
+
+     This permits, for the purpose of phrase searching, placing
+     multiple terms in a single position.  This is useful with
+     stemmers that produce multiple possible stems for a word.
+
+     This also permits the introduction of gaps between terms, so that
+     terms which are adjacent in a token stream will not be matched by
+     and exact phrase query.  This makes it possible, e.g., to build
+     an analyzer where phrases are not matched over stop words which
+     have been removed.
+
+     Finally, repeating a token with an increment of zero can also be
+     used to boost scores of matches on that token.
+

 1.2 RC6

--- a/src/java/org/apache/lucene/analysis/Token.java
+++ b/src/java/org/apache/lucene/analysis/Token.java
@ -54,6 +54,8 @@ package org.apache.lucene.analysis;
 * <http://www.apache.org/>.
 */

+import org.apache.lucene.index.TermPositions;
+
 /** A Token is an occurence of a term from the text of a field.  It consists of
  a term's text, the start and end offset of the term in the text of the field,
  and a type string.
@ -98,19 +100,21 @@ public final class Token {
   *
   * <p>The default value is one.
   *
-   * <p>Two common uses for this are:<ul>
+   * <p>Some common uses for this are:<ul>
   *
   * <li>Set it to zero to put multiple terms in the same position.  This is
-   * useful if, e.g., when a word has multiple stems.  This way searches for
-   * phrases including either stem will match this occurence.  In this case,
-   * all but the first stem's increment should be set to zero: the increment of
-   * the first instance should be one.
+   * useful if, e.g., a word has multiple stems.  Searches for phrases
+   * including either stem will match.  In this case, all but the first stem's
+   * increment should be set to zero: the increment of the first instance
+   * should be one.  Repeating a token with an increment of zero can also be
+   * used to boost the scores of matches on that token.
   *
   * <li>Set it to values greater than one to inhibit exact phrase matches.
-   * If, for example, one does not want phrases to match across stop words,
-   * then one could build a stop word filter that removes stop words and also
-   * sets the increment to the number of stop words removed before each
-   * non-stop word.
+   * If, for example, one does not want phrases to match across removed stop
+   * words, then one could build a stop word filter that removes stop words and
+   * also sets the increment to the number of stop words removed before each
+   * non-stop word.  Then exact phrase queries will only match when the terms
+   * occur with no intervening stop words.
   *
   * </ul>
   * @see TermPositions