mirror of https://github.com/apache/lucene.git
1d68f8c88d
was no point in explicitly looking for them as the scoring algorithm would effectively ignore them. I did a test and indexed 700 pages on a corporate web site and then ran the MoreLikeThis code on them, and half of the docs had stop words identified as interesting. So I added code to ignore stop words, but made it backward compatible so that by default this code is not used. git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169512 13f79535-47bb-0310-9956-ffa450edef68

| Name |
|---|
| src/java/org/apache/lucene/search/similar |
| .cvsignore |
| README.txt |
| build.xml |

README.txt
Document similarity measures. The most significant contribution here is MoreLikeThis, in /src/java/org/apache/lucene/search/similar.
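
The commit message above describes making MoreLikeThis skip stop words when picking "interesting" terms, with the filter off by default. The following is a minimal, self-contained sketch of that idea only. It does not use the Lucene API, the class and method names (`StopWordFilterSketch`, `interestingTerms`) are hypothetical, and ranking by raw term frequency is a simplifying assumption standing in for whatever scoring MoreLikeThis actually applies. Passing `null` as the stop-word set models the backward-compatible default of no filtering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of Lucene.
class StopWordFilterSketch {

    // Returns up to maxTerms terms, most frequent first (ties broken alphabetically).
    // A null stopWords set means "do not filter", mirroring the default-off behavior.
    static List<String> interestingTerms(String text, Set<String> stopWords, int maxTerms) {
        Map<String, Integer> freq = new HashMap<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (tok.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            if (stopWords != null && stopWords.contains(tok)) {
                continue; // skip stop words only when a set was supplied
            }
            freq.merge(tok, 1, Integer::sum);
        }
        List<String> terms = new ArrayList<>(freq.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(freq.get(b), freq.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    public static void main(String[] args) {
        String text = "the cat and the dog and the bird cat";
        Set<String> stop = new HashSet<>(Arrays.asList("the", "and"));
        System.out.println(interestingTerms(text, null, 2)); // no filtering
        System.out.println(interestingTerms(text, stop, 2)); // stop words ignored
    }
}
```

Without a stop-word set, the most frequent words in the sample text are the stop words themselves, which is the symptom the commit message reports; supplying the set leaves only content words as candidates.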