David Spencer
1d68f8c88d
Logic to ignore stop words was in an early version of this code, but it was taken out in the belief that there
was no point in explicitly looking for them, as the scoring algorithm would effectively ignore them.
I did a test: I indexed 700 pages on a corporate web site and then ran the MoreLikeThis code on them,
and half of the docs had stop words identified as interesting.
So I added code to ignore stop words, but made it backward compatible so that by default this code
is not used.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169512 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 19:29:56 +00:00
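A minimal sketch of the behaviour described in the commit message above: an optional stop-word set that is off by default, so existing callers see no change. This is not the MoreLikeThis code itself. The class name InterestingTermsSketch, the method topTerms, and the helper isNoiseWord are hypothetical, and raw term frequency is used as a simplified stand-in for the real scoring.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; names and scoring are assumptions, not Lucene's API.
public class InterestingTermsSketch {

    // null (the default) means "do not filter", which keeps the old behaviour.
    private Set<String> stopWords = null;

    public void setStopWords(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // A term is skipped only when a stop-word set has been supplied and contains it.
    private boolean isNoiseWord(String term) {
        return stopWords != null && stopWords.contains(term);
    }

    // Returns up to maxTerms terms, most frequent first, ties broken alphabetically.
    // Raw frequency is a stand-in for the real scoring, so stop words rank high
    // here unless they are filtered out.
    public List<String> topTerms(String text, int maxTerms) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty() || isNoiseWord(token)) {
                continue;
            }
            freq.merge(token, 1, Integer::sum);
        }
        List<String> terms = new ArrayList<>(freq.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(freq.get(b), freq.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

With no stop-word set supplied, common words can appear among the top terms; calling setStopWords with a set such as "the" and "and" removes them from the result without changing the default path.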
David Spencer
81087e8bb6
Touchup javadoc.
...
Make retrieveInterestingTerms only return the top terms, not all terms.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169511 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 19:10:28 +00:00
David Spencer
175cf8a9fd
[1] Added comments to retrieveTerms() to document the return value.
...
[2] Added convenience routine retrieveInterestingTerms() which makes it easier to get at the "interesting words" in a document.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169508 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 18:49:43 +00:00
Erik Hatcher
a79c508580
#34816 - adjust for contrib/WordNet renaming
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169391 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 01:19:03 +00:00
David Spencer
c696188668
don't print out summary unless it's present
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169366 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 21:37:50 +00:00
David Spencer
7f8bf69311
cleanup deprecated warnings so it compiles cleanly w/ the current lucene code, lucene-core-1.9-rc1-dev.jar
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169365 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 21:36:22 +00:00
David Spencer
c680751f63
test checkin of README, just to verify my permissions
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169349 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 19:25:40 +00:00
Daniel Naber
129227dce1
throw a more helpful exception if supposed directory is a file
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169136 13f79535-47bb-0310-9956-ffa450edef68
2005-05-08 14:51:29 +00:00
Erik Hatcher
78dbe41805
prefix all JARs with lucene-
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168986 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 23:43:54 +00:00
Daniel Naber
9f78244f9e
convenience constructors that load list of stop words from a file
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168970 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 22:28:52 +00:00
Daniel Naber
c3f90ad76e
use non-deprecated API
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168642 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 19:32:54 +00:00
Daniel Naber
529214394c
remove useless parameter
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168640 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 19:29:40 +00:00
Erik Hatcher
e8c90fb050
rename WordNet to wordnet, required intermediate move due to OS case insensitivity
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168480 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:32:00 +00:00
Erik Hatcher
5fd5169a6f
temporary move to lowercase WordNet
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168479 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:31:11 +00:00
Erik Hatcher
dd472377dd
adjust code to fix compile/javadoc errors on JDK 1.5
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168478 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:26:08 +00:00
Erik Hatcher
a12dac37b4
adjust project names for consistency
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168476 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:24:18 +00:00
Daniel Naber
170bdc33a3
call static methods via class, not via object (avoids warning in Eclipse)
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168454 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:46:09 +00:00
Daniel Naber
ffbdf0b882
test using a non-existing field as first sort key
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168453 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:41:44 +00:00
Mark Harwood
12a91b4395
Fixed bug where docs larger than maxDocBytesToAnalyze would cause last fragment to be sized as remainder of doc (which could be huge).
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168452 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:40:45 +00:00
Daniel Naber
a20246c68c
don't declare Exceptions that are never thrown; remove an unused variable
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168450 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:37:09 +00:00
Daniel Naber
c97ba92ebd
refactoring so that filename extensions are in one place
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168449 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:20:49 +00:00
Daniel Naber
0209ce959b
don't print to stdout in test cases
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168338 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 15:22:03 +00:00
Daniel Naber
30fe087036
update build instructions and version numbers
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168332 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 13:38:34 +00:00
Daniel Naber
4b00637662
only delete our own files when re-creating an index ( #34695 )
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168213 13f79535-47bb-0310-9956-ffa450edef68
2005-05-04 23:34:52 +00:00
Daniel Naber
77f94fb60c
mention the new Java 1.4 requirement that we already agreed on in January
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168212 13f79535-47bb-0310-9956-ffa450edef68
2005-05-04 23:26:00 +00:00
Daniel Naber
0e9579345a
fixing typos; WordNet url update
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168208 13f79535-47bb-0310-9956-ffa450edef68
2005-05-04 23:10:37 +00:00
Erik Hatcher
8f70c09b9b
Wolfgang is non-stop with the additions. Easy enough to paste in, so here it is with a Collection-based TokenStream
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168029 13f79535-47bb-0310-9956-ffa450edef68
2005-05-04 00:24:17 +00:00
Erik Hatcher
f94ebdb41e
applied norm caching patch from Wolfgang
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@167958 13f79535-47bb-0310-9956-ffa450edef68
2005-05-03 19:01:58 +00:00
Erik Hatcher
2a37a3e820
Apply Wolfgang's fix to the tests
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@167835 13f79535-47bb-0310-9956-ffa450edef68
2005-05-03 00:33:27 +00:00
Andreas Vajda
572633f8c4
- reworked store I/O to use new IndexInput and IndexOutput classes
...
- reworked store I/O to avoid upstream buffering giving better txn control
- added DbStoreTest unit test adapted from StoreTest
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165674 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 20:06:00 +00:00
Daniel Naber
4b1834eeee
sorry, typo in image URL
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165660 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 18:50:45 +00:00
Daniel Naber
4b2d7f3fe0
use non-relative URL for image to make it work in sub directories; remove non-existing stuff from sandbox page
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165659 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 18:47:19 +00:00
Daniel Naber
cfb14e1be8
improve text of exception
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165658 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 18:43:48 +00:00
Erik Hatcher
b01de31134
Add contrib/memory to javadocs, and add imported build files into src distribution
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165616 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 10:00:44 +00:00
Erik Hatcher
9464b37949
remove ignores since artifacts now are built into main directory rather than here
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165615 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 09:53:43 +00:00
Erik Hatcher
8f9e2a15e7
Enhancement #34585 - high-performance in-memory index contributed by Wolfgang Hoschek
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165606 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 09:04:07 +00:00
Erik Hatcher
bc49f328c6
aggregate duplicated distribution patterns into reusable patternsets
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165571 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 01:06:14 +00:00
Erik Hatcher
fe95807ca8
belated checkin - moved deprecated build/test targets to separate easily removable import build file
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165569 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:53:41 +00:00
Erik Hatcher
eb50b47c8b
add contrib pieces to distribution files
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165568 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:51:18 +00:00
Erik Hatcher
c3847f26ea
overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165566 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:11:11 +00:00
Erik Hatcher
21431112fe
adjust license headers to be ASL 2.0
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165565 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:08:04 +00:00
Erik Hatcher
df52ba1ec6
standardizing source layout
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165562 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 23:57:31 +00:00
Erik Hatcher
f56d33e2d4
Add ASL header - sorry for the oversight on this.
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165559 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 22:57:39 +00:00
Daniel Naber
b8dfd507eb
whitespace cleanup only (no more tabs/spaces mix)
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165552 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 22:04:24 +00:00
Daniel Naber
e8fd6b347c
remove non-existing projects and fix a link
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165509 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 14:24:51 +00:00
Daniel Naber
d087df635f
move resource page to the wiki to avoid content duplication
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165508 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 14:07:36 +00:00
Daniel Naber
db8246f137
forgot to commit these files
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165500 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 13:09:36 +00:00
Daniel Naber
9c3bd9ca86
import cleanup
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165484 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 11:59:02 +00:00
Erik Hatcher
acf2b4c60c
Remove outdated sandbox code
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165365 13f79535-47bb-0310-9956-ffa450edef68
2005-04-30 00:07:27 +00:00
Daniel Naber
f848854278
fixing property
...
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165355 13f79535-47bb-0310-9956-ffa450edef68
2005-04-29 22:58:33 +00:00