David Spencer
1d68f8c88d
Logic to ignore stop words was in an early version of this code, but it was taken out in the belief that there
was no point in explicitly looking for them as the scoring algorithm would effectively ignore them.
I ran a test: I indexed 700 pages from a corporate web site, then ran the MoreLikeThis code on them,
and half of the docs had stop words identified as interesting.
So I added code to ignore stop words, but made it backward compatible so that by default this code
is not used.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169512 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 19:29:56 +00:00
David Spencer
81087e8bb6
Touchup javadoc.
Make retrieveInterestingTerms only return the top terms, not all terms.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169511 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 19:10:28 +00:00
David Spencer
175cf8a9fd
[1] Added comments to retrieveTerms() to document the return value.
[2] Added convenience routine retrieveInterestingTerms() which makes it easier to get at the "interesting words" in a document.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169508 13f79535-47bb-0310-9956-ffa450edef68
2005-05-10 18:49:43 +00:00
David Spencer
c696188668
don't print out summary unless it's present
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169366 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 21:37:50 +00:00
David Spencer
7f8bf69311
cleanup deprecated warnings so it compiles cleanly w/ the current lucene code, lucene-core-1.9-rc1-dev.jar
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169365 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 21:36:22 +00:00
David Spencer
c680751f63
test checkin of README, just to verify my permissions
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169349 13f79535-47bb-0310-9956-ffa450edef68
2005-05-09 19:25:40 +00:00
Erik Hatcher
78dbe41805
prefix all JARs with lucene-
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168986 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 23:43:54 +00:00
Erik Hatcher
e8c90fb050
rename WordNet to wordnet, required intermediate move due to OS case insensitivity
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168480 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:32:00 +00:00
Erik Hatcher
5fd5169a6f
temporary move to lowercase WordNet
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168479 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:31:11 +00:00
Erik Hatcher
dd472377dd
adjust code to fix compile/javadoc errors on JDK 1.5
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168478 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:26:08 +00:00
Erik Hatcher
a12dac37b4
adjust project names for consistency
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168476 13f79535-47bb-0310-9956-ffa450edef68
2005-05-06 00:24:18 +00:00
Mark Harwood
12a91b4395
Fixed bug where docs larger than maxDocBytesToAnalyze would cause last fragment to be sized as remainder of doc (which could be huge).
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168452 13f79535-47bb-0310-9956-ffa450edef68
2005-05-05 22:40:45 +00:00
Erik Hatcher
8f70c09b9b
Wolfgang is non-stop with the additions. Easy enough to paste in, so here it is with a Collection-based TokenStream
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@168029 13f79535-47bb-0310-9956-ffa450edef68
2005-05-04 00:24:17 +00:00
Erik Hatcher
f94ebdb41e
applied norm caching patch from Wolfgang
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@167958 13f79535-47bb-0310-9956-ffa450edef68
2005-05-03 19:01:58 +00:00
Erik Hatcher
2a37a3e820
Apply Wolfgang's fix to the tests
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@167835 13f79535-47bb-0310-9956-ffa450edef68
2005-05-03 00:33:27 +00:00
Andreas Vajda
572633f8c4
- reworked store I/O to use new IndexInput and IndexOutput classes
- reworked store I/O to avoid upstream buffering giving better txn control
- added DbStoreTest unit test adapted from StoreTest
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165674 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 20:06:00 +00:00
Erik Hatcher
8f9e2a15e7
Enhancement #34585 - high-performance in-memory index contributed by Wolfgang Hoschek
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165606 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 09:04:07 +00:00
Erik Hatcher
c3847f26ea
overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165566 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:11:11 +00:00
Erik Hatcher
21431112fe
adjust license headers to be ASL 2.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165565 13f79535-47bb-0310-9956-ffa450edef68
2005-05-02 00:08:04 +00:00
Erik Hatcher
df52ba1ec6
standardizing source layout
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165562 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 23:57:31 +00:00
Erik Hatcher
f56d33e2d4
Add ASL header - sorry for the oversight on this.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165559 13f79535-47bb-0310-9956-ffa450edef68
2005-05-01 22:57:39 +00:00
Andreas Vajda
77130721ce
- replaced db.jar with db-4.3.27.jar
- downloading db-4.3.27.jar from http://downloads.osafoundation.org/db
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@165319 13f79535-47bb-0310-9956-ffa450edef68
2005-04-29 17:33:27 +00:00
Erik Hatcher
d9042b00d8
move PrecedenceQueryParser to contrib/misc until the kinks are worked out
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164964 13f79535-47bb-0310-9956-ffa450edef68
2005-04-27 09:32:33 +00:00
Erik Hatcher
7b8f43ec7c
move misc over to official contrib area
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164963 13f79535-47bb-0310-9956-ffa450edef68
2005-04-27 09:16:31 +00:00
Erik Hatcher
5c9ccb2442
Add Lucene's test classes to contrib test classpath, some tests rely on the utility methods in the core tests
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164937 13f79535-47bb-0310-9956-ffa450edef68
2005-04-27 01:52:17 +00:00
Erik Hatcher
790dfc1490
javadoc fixup
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164742 13f79535-47bb-0310-9956-ffa450edef68
2005-04-26 04:41:54 +00:00
Erik Hatcher
26aab23901
add ignores
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164698 13f79535-47bb-0310-9956-ffa450edef68
2005-04-26 00:30:08 +00:00
Erik Hatcher
d650384d4b
add GreekAnalyzer, contributed by Panagiotis Astithas (past@ebs.gr)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@164686 13f79535-47bb-0310-9956-ffa450edef68
2005-04-25 23:23:37 +00:00
Erik Hatcher
2fe0a80189
rename misspelled indexDictionnary method
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160988 13f79535-47bb-0310-9956-ffa450edef68
2005-04-12 00:11:33 +00:00
Erik Hatcher
ec522fc1c8
Fixed deprecation issues, adjusted test cases to use assertEquals better, reformatted style
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160987 13f79535-47bb-0310-9956-ffa450edef68
2005-04-11 23:48:02 +00:00
Erik Hatcher
0c99b57cc1
Fixed issue with ctor parameter being ignored
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160984 13f79535-47bb-0310-9956-ffa450edef68
2005-04-11 23:43:57 +00:00
Erik Hatcher
e88213a2d9
refactor build to use common contrib build system
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160983 13f79535-47bb-0310-9956-ffa450edef68
2005-04-11 23:42:26 +00:00
Daniel Naber
c4f1ee70a9
use lowercase method names; remove javadoc that's inherited anyway
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160070 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 17:50:38 +00:00
Daniel Naber
04ea892fbe
import cleanup
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160065 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 17:45:36 +00:00
Erik Hatcher
6f5f23444c
enhanced test contributed by Sven. Encoding tweaks
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160034 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 12:25:16 +00:00
Erik Hatcher
0ff227ff0a
switch dotted u character to use unicode value reference
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160023 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 10:16:37 +00:00
Erik Hatcher
4e580e221e
Issue deprecation warnings when building test cases. Fixed deprecation warnings on TestKeywordAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160012 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 09:10:59 +00:00
Erik Hatcher
3be3e8ab5d
Add accent character normalizer filter contributed by Sven Duzont. Also created simple test case.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@160011 13f79535-47bb-0310-9956-ffa450edef68
2005-04-04 09:10:05 +00:00
Daniel Naber
69380a1815
adapt to use of jline
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@158852 13f79535-47bb-0310-9956-ffa450edef68
2005-03-23 23:49:08 +00:00
Daniel Naber
84db65bfde
adapt to use of jline
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@158851 13f79535-47bb-0310-9956-ffa450edef68
2005-03-23 23:42:23 +00:00
Daniel Naber
5a59714f4a
use jline instead of java-readline. jline can be added to SVN thanks to its BSD license. plus some small cleanup.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@158850 13f79535-47bb-0310-9956-ffa450edef68
2005-03-23 23:40:50 +00:00
Erik Hatcher
b54f22aaab
Fix max word length issue (though don't know why anyone would limit long words in a more-like-this query).
Also, modified to take into account all values of a field rather than just the first one.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@158076 13f79535-47bb-0310-9956-ffa450edef68
2005-03-18 15:03:00 +00:00
Erik Hatcher
1cb674fc04
regenerated from latest Snowball CVS
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@157834 13f79535-47bb-0310-9956-ffa450edef68
2005-03-17 00:41:31 +00:00
Erik Hatcher
9621a0985c
added title to documentation
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@156593 13f79535-47bb-0310-9956-ffa450edef68
2005-03-09 01:59:14 +00:00
Erik Hatcher
9824226394
Contribution of slick Swing models to enable on-the-fly searching of
tables and lists. Created by Jonathan Simon.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@156591 13f79535-47bb-0310-9956-ffa450edef68
2005-03-09 01:52:13 +00:00
Mark Harwood
fdf05bd088
Fixed missing fieldname in API
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@154447 13f79535-47bb-0310-9956-ffa450edef68
2005-02-19 19:51:04 +00:00
Daniel Naber
05d0335dcd
offer additional methods that take analyzer + text instead of tokenstream; fix some unused imports and variables
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@154444 13f79535-47bb-0310-9956-ffa450edef68
2005-02-19 19:08:52 +00:00
Daniel Naber
335c1567d8
remove empty "@return" tags so javadoc stops complaining; small whitespace cleanup
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@154083 13f79535-47bb-0310-9956-ffa450edef68
2005-02-16 20:37:57 +00:00
Daniel Naber
45864d1c9c
clean up imports, remove unused variables and remove the declaration of an Exception that was never thrown
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@154080 13f79535-47bb-0310-9956-ffa450edef68
2005-02-16 20:20:15 +00:00
Erik Hatcher
28e712b2ee
update docs to account for TLP migration
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@153802 13f79535-47bb-0310-9956-ffa450edef68
2005-02-14 16:48:47 +00:00
Erik Hatcher
373e613341
remove unnecessary import
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@153430 13f79535-47bb-0310-9956-ffa450edef68
2005-02-11 18:11:37 +00:00
Erik Hatcher
2ac412f6b7
move similarity and spellchecker to new contrib area
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@153429 13f79535-47bb-0310-9956-ffa450edef68
2005-02-11 18:11:05 +00:00
Erik Hatcher
f375d09898
add customizable buffer size
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@153412 13f79535-47bb-0310-9956-ffa450edef68
2005-02-11 15:30:14 +00:00
Erik Hatcher
cd0d0937e1
split keyword tokenizer out of KeywordAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@153398 13f79535-47bb-0310-9956-ffa450edef68
2005-02-11 13:50:37 +00:00
Erik Hatcher
826fef7f6a
KeywordAnalyzer contribution - adapted from _Lucene in Action_ code
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@152921 13f79535-47bb-0310-9956-ffa450edef68
2005-02-08 19:13:05 +00:00
Mark Harwood
276ab079f5
Added Nicko Cadell's Encoder contribution
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151622 13f79535-47bb-0310-9956-ffa450edef68
2005-02-06 21:31:54 +00:00
Mark Harwood
b1555b0bbf
Test SVN Commit
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151615 13f79535-47bb-0310-9956-ffa450edef68
2005-02-06 18:12:57 +00:00
Erik Hatcher
0ee1728e6d
move two more projects over to contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151590 13f79535-47bb-0310-9956-ffa450edef68
2005-02-06 15:35:12 +00:00
Erik Hatcher
646f0f0434
Switch ant project to conventional src/java directory structure
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151589 13f79535-47bb-0310-9956-ffa450edef68
2005-02-06 14:51:59 +00:00
Erik Hatcher
767312d611
add convenient TODO file to keep track of sandbox -> contrib move
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151469 13f79535-47bb-0310-9956-ffa450edef68
2005-02-05 02:23:19 +00:00
Erik Hatcher
10904d02f6
fix most deprecation warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151468 13f79535-47bb-0310-9956-ffa450edef68
2005-02-05 02:21:39 +00:00
Erik Hatcher
0955eef89f
move parts of the sandbox over to contrib area
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@151459 13f79535-47bb-0310-9956-ffa450edef68
2005-02-05 01:25:43 +00:00