document Christoph's improvements to FuzzyQuery

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@150508 13f79535-47bb-0310-9956-ffa450edef68
Daniel Naber 2004-09-14 22:19:53 +00:00
parent 862b9ea4a7
commit 61e338ae88
3 changed files with 33 additions and 3 deletions

@@ -11,7 +11,12 @@ $Id$
 2. FuzzyQuery now takes an additional parameter that specifies the
 minimum similarity that is required for a term to match the query.
-Note that this isn't supported by QueryParser yet. (Daniel Naber)
+The QueryParser syntax for this is term~x, where x is a floating
+point number between 0 and 1 (a bigger number means that a higher
+similarity is required). Furthermore, a prefix can be specified
+for FuzzyQuerys so that only those terms are considered similar that
+start with this prefix. This can speed up FuzzyQuery greatly.
+(Daniel Naber, Christoph Goller)
 3. The Russian and the German analyzers have been moved to Sandbox.
 Also, the WordlistLoader class has been moved one level up in the

@@ -385,7 +385,28 @@ limitations under the License.
 </tr>
 </table>
 </div>
-<p>This search will find terms like foam and roams</p>
+<p>This search will find terms like foam and roams.</p>
+<p>Starting with Lucene 1.9 an additional (optional) parameter can specify the required similarity. The value is between 0 and 1, with a value closer to 1 only terms with a higher similarity will be matched. For example:</p>
+<div align="left">
+<table cellspacing="4" cellpadding="0" border="0">
+<tr>
+<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+</tr>
+<tr>
+<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+<td bgcolor="#ffffff"><pre>roam~0.8</pre></td>
+<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+</tr>
+<tr>
+<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
+</tr>
+</table>
+</div>
+<p>The default that is used if the parameter is not given is 0.5.</p>
 </blockquote>
 </td></tr>
 <tr><td><br/></td></tr>

@@ -88,7 +88,11 @@
 <p>Lucene supports fuzzy searches based on the Levenshtein Distance, or Edit Distance algorithm. To do a fuzzy search use the tilde, "~", symbol at the end of a Single word Term. For example to search for a term similar in spelling to "roam" use the fuzzy search: </p>
 <source>roam~</source>
-<p>This search will find terms like foam and roams</p>
+<p>This search will find terms like foam and roams.</p>
+<p>Starting with Lucene 1.9 an additional (optional) parameter can specify the required similarity. The value is between 0 and 1, with a value closer to 1 only terms with a higher similarity will be matched. For example:</p>
+<source>roam~0.8</source>
+<p>The default that is used if the parameter is not given is 0.5.</p>
 </subsection>