mirror of https://github.com/apache/lucene.git
Improved documentation.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149830 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
efb2e12705
commit
91512d5dde
15
CHANGES.txt
15
CHANGES.txt
|
@ -58,6 +58,21 @@ $Id$
|
|||
for longer fields. Once the index is re-created, scores will be
|
||||
as before. (cutting)
|
||||
|
||||
13. Added new method Token.setPositionIncrement().
|
||||
|
||||
This permits, for the purpose of phrase searching, placing
|
||||
multiple terms in a single position. This is useful with
|
||||
stemmers that produce multiple possible stems for a word.
|
||||
|
||||
This also permits the introduction of gaps between terms, so that
|
||||
terms which are adjacent in a token stream will not be matched by
|
||||
and exact phrase query. This makes it possible, e.g., to build
|
||||
an analyzer where phrases are not matched over stop words which
|
||||
have been removed.
|
||||
|
||||
Finally, repeating a token with an increment of zero can also be
|
||||
used to boost scores of matches on that token.
|
||||
|
||||
|
||||
1.2 RC6
|
||||
|
||||
|
|
|
@ -54,6 +54,8 @@ package org.apache.lucene.analysis;
|
|||
* <http://www.apache.org/>.
|
||||
*/
|
||||
|
||||
import org.apache.lucene.index.TermPositions;
|
||||
|
||||
/** A Token is an occurence of a term from the text of a field. It consists of
|
||||
a term's text, the start and end offset of the term in the text of the field,
|
||||
and a type string.
|
||||
|
@ -98,19 +100,21 @@ public final class Token {
|
|||
*
|
||||
* <p>The default value is one.
|
||||
*
|
||||
* <p>Two common uses for this are:<ul>
|
||||
* <p>Some common uses for this are:<ul>
|
||||
*
|
||||
* <li>Set it to zero to put multiple terms in the same position. This is
|
||||
* useful if, e.g., when a word has multiple stems. This way searches for
|
||||
* phrases including either stem will match this occurence. In this case,
|
||||
* all but the first stem's increment should be set to zero: the increment of
|
||||
* the first instance should be one.
|
||||
* useful if, e.g., a word has multiple stems. Searches for phrases
|
||||
* including either stem will match. In this case, all but the first stem's
|
||||
* increment should be set to zero: the increment of the first instance
|
||||
* should be one. Repeating a token with an increment of zero can also be
|
||||
* used to boost the scores of matches on that token.
|
||||
*
|
||||
* <li>Set it to values greater than one to inhibit exact phrase matches.
|
||||
* If, for example, one does not want phrases to match across stop words,
|
||||
* then one could build a stop word filter that removes stop words and also
|
||||
* sets the increment to the number of stop words removed before each
|
||||
* non-stop word.
|
||||
* If, for example, one does not want phrases to match across removed stop
|
||||
* words, then one could build a stop word filter that removes stop words and
|
||||
* also sets the increment to the number of stop words removed before each
|
||||
* non-stop word. Then exact phrase queries will only match when the terms
|
||||
* occur with no intervening stop words.
|
||||
*
|
||||
* </ul>
|
||||
* @see TermPositions
|
||||
|
|
Loading…
Reference in New Issue