lucene

David Spencer 1d68f8c88d Logic to ignore stop words was in an early version of this code but it was taken out in the belief that there was no point in explicitly looking for them as the scoring algorithm would effectively ignore them. I did a test and indexed 700 pages on a corporate web site and then ran the MoreLikeThis code on them and 1/2 of the docs had stop words identified as interesting. So - I added code in to ignore stop words, but made it backward compatible so that by default this code is not used. git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@169512 13f79535-47bb-0310-9956-ffa450edef68		2005-05-10 19:29:56 +00:00
analyzers	overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine	2005-05-02 00:11:11 +00:00
ant	adjust project names for consistency	2005-05-06 00:24:18 +00:00
db	- reworked store I/O to use new IndexInput and IndexOutput classes	2005-05-02 20:06:00 +00:00
highlighter	Fixed bug where docs larger than maxDocBytesToAnalyze would cause last fragment to be sized as remainder of doc (which could be huge).	2005-05-05 22:40:45 +00:00
javascript	move two more projects over to contrib	2005-02-06 15:35:12 +00:00
lucli	overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine	2005-05-02 00:11:11 +00:00
memory	Wolfgang is non-stop with the additions. Easy enough to paste in, so here it is with a Collection-based TokenStream	2005-05-04 00:24:17 +00:00
miscellaneous	overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine	2005-05-02 00:11:11 +00:00
similarity	Logic to ignore stop words was in an early version of this code but it was taken out in the belief that there	2005-05-10 19:29:56 +00:00
snowball	overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine	2005-05-02 00:11:11 +00:00
spellchecker	adjust code to fix compile/javadoc errors on JDK 1.5	2005-05-06 00:26:08 +00:00
swing	overhaul of build system to facilitate building and packaging of contrib sub-projects. some work still to be done, but core Lucene build still working fine	2005-05-02 00:11:11 +00:00
wordnet	rename WordNet to wordnet, required intermediate move due to OS case insensitivity	2005-05-06 00:32:00 +00:00
TODO.txt	add convenient TODO file to keep track of sandbox -> contrib move	2005-02-05 02:23:19 +00:00
contrib-build.xml	prefix all JARs with lucene-	2005-05-06 23:43:54 +00:00