2935 Commits

Author SHA1 Message Date
Michael Busch
fab92e9494 Update trunk to version 2.4-dev
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609846 13f79535-47bb-0310-9956-ffa450edef68
2008-01-08 02:43:24 +00:00
Michael McCandless
2677871bbb LUCENE-508: make sure SegmentTermEnum.prev() is accurate (= last term) after next() returns false
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609780 13f79535-47bb-0310-9956-ffa450edef68
2008-01-07 21:15:48 +00:00
Michael McCandless
eaba22c72a Fixed a few issues uncovered by YourKit profiling:
  * We were allocating 2X the size of each char block, but only
    actually using the first half!
  * Improved accuracy of numBytesAlloc tracking in DW
  * Small optimization to not use token.setTermText from DW


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609546 13f79535-47bb-0310-9956-ffa450edef68
2008-01-07 09:48:06 +00:00
Michael McCandless
393a1d0575 LUCENE-1119: small optimization to TermInfosWriter.add to take a char[] instead of Term/String
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609378 13f79535-47bb-0310-9956-ffa450edef68
2008-01-06 19:29:45 +00:00
Michael McCandless
26bc874e62 LUCENE-1118: skip terms > 255 (by default) characters in length in StandardAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609330 13f79535-47bb-0310-9956-ffa450edef68
2008-01-06 15:37:44 +00:00
Michael McCandless
f0d5002066 LUCENE-1117: fix intermittent thread safety issue w/ EnwikiDocMaker
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@609080 13f79535-47bb-0310-9956-ffa450edef68
2008-01-05 01:51:53 +00:00
Grant Ingersoll
79e09db401 LUCENE-1103: Internal links should increment as all tokens do, since the first token is valid too
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608989 13f79535-47bb-0310-9956-ffa450edef68
2008-01-04 20:36:23 +00:00
Grant Ingersoll
b18f6ae959 LUCENE-1103: The link is now incremented by 1, but then the next token in the link is not incremented. This way, the link is not associated with the previous term. Instead it is associated with the next term in the link, which would be the display tokens. If there are no display tokens, then it will take its proper place in the token chain.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608978 13f79535-47bb-0310-9956-ffa450edef68
2008-01-04 20:15:22 +00:00
Michael McCandless
d86944d06f fix javadoc warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608956 13f79535-47bb-0310-9956-ffa450edef68
2008-01-04 18:46:33 +00:00
Grant Ingersoll
f715fc6031 LUCENE-1103
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608852 13f79535-47bb-0310-9956-ffa450edef68
2008-01-04 14:29:15 +00:00
Michael McCandless
2d633f98a2 LUCENE-1112: skip immense terms and mark a doc for deletion if it hits a non-aborting exception
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608534 13f79535-47bb-0310-9956-ffa450edef68
2008-01-03 15:49:50 +00:00
Michael McCandless
f12862426a fix typo
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608523 13f79535-47bb-0310-9956-ffa450edef68
2008-01-03 15:20:41 +00:00
Doron Cohen
9e65cd554f LUCENE-1116: contrib/benchmark quality package improvements (MRR, Trec1MQ)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608370 13f79535-47bb-0310-9956-ffa450edef68
2008-01-03 07:44:40 +00:00
Doron Cohen
40eb1cd53f LUCENE-766: test added for adding two fields with same
name but different term vector setting.


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608367 13f79535-47bb-0310-9956-ffa450edef68
2008-01-03 07:32:38 +00:00
Michael McCandless
263244312d LUCENE-1115: some small fixes to contrib/benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608306 13f79535-47bb-0310-9956-ffa450edef68
2008-01-03 01:48:18 +00:00
Grant Ingersoll
ed893f770c LUCENE-1114: Updated example
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@608126 13f79535-47bb-0310-9956-ffa450edef68
2008-01-02 15:30:40 +00:00
Daniel Naber
0ae1e1a905 LUCENE-1113: fix for Document.getBoost() documentation
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607913 13f79535-47bb-0310-9956-ffa450edef68
2008-01-01 21:05:15 +00:00
Michael Busch
75473edb02 LUCENE-746: Fix error message in AnalyzingQueryParser.getPrefixQuery.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607832 13f79535-47bb-0310-9956-ffa450edef68
2008-01-01 12:49:44 +00:00
Grant Ingersoll
90a735441f LUCENE-1102: EnwikiDocMaker now adds a docid field
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607732 13f79535-47bb-0310-9956-ffa450edef68
2007-12-31 13:07:14 +00:00
Doron Cohen
f39f15ec43 trivial: fix typo.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607724 13f79535-47bb-0310-9956-ffa450edef68
2007-12-31 11:05:02 +00:00
Doron Cohen
ece8361ab5 LUCENE-749: ChainedFilter behavior fixed when logic of first filter is ANDNOT.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607606 13f79535-47bb-0310-9956-ffa450edef68
2007-12-30 22:47:59 +00:00
Doron Cohen
f4639c0ab0 LUCENE-1095: option added to StopFilter and QueryParser to consider position increments.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607591 13f79535-47bb-0310-9956-ffa450edef68
2007-12-30 21:19:17 +00:00
Doron Cohen
b367e863e6 LUCENE-1101: TokenStream.next(Token) reuse 'policy': calling Token.clear() should be responsibility of token producer.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607521 13f79535-47bb-0310-9956-ffa450edef68
2007-12-30 07:34:30 +00:00
Doron Cohen
efbd1260a9 Rename section "Javadocs" to "Javadocs for Official Releases",
following discussion http://www.nabble.com/site-javadocs-link-broken-tt14507459.html


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607350 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 22:43:36 +00:00
Grant Ingersoll
bd340a896d
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607291 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 17:08:26 +00:00
Grant Ingersoll
5c81934465
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607290 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 17:08:16 +00:00
Grant Ingersoll
cb94c6aed4
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607288 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 17:07:33 +00:00
Grant Ingersoll
40d85a7781 Switch to using the EnwikiDocMaker
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607281 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 16:29:03 +00:00
Grant Ingersoll
ac27fb02de Changed from overriding next(Token) to next()
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607246 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 14:17:48 +00:00
Grant Ingersoll
da2a912919 LUCENE-1068: updated StandardTokenizer, Analyzer to allow for the replaceInvalidAcronym
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607161 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 02:46:11 +00:00
Doron Cohen
93b9adc280 LUCENE-1099: Make Tokenizer.reset(Reader) public.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606865 13f79535-47bb-0310-9956-ffa450edef68
2007-12-26 09:21:46 +00:00
Michael McCandless
9a9d138d8b remove 'implements Cloneable' from MergePolicy.MergeSpecification
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606657 13f79535-47bb-0310-9956-ffa450edef68
2007-12-24 02:35:12 +00:00
Michael Busch
c0040adc27 LUCENE-1098: Make inner class StandardAnalyzer.SavedStreams static and final.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606647 13f79535-47bb-0310-9956-ffa450edef68
2007-12-24 00:48:31 +00:00
Michael Busch
3084aecc85 Added news item to website about nightly maven snapshots
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606641 13f79535-47bb-0310-9956-ffa450edef68
2007-12-23 23:57:33 +00:00
Doron Cohen
23da0335a5 LUCENE-1096: Fixed Hits behavior when hits' docs are deleted while iterating over the hits.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606614 13f79535-47bb-0310-9956-ffa450edef68
2007-12-23 20:50:29 +00:00
Michael McCandless
89fe185d71 LUCENE-1097: change IndexWriter.close(false) to ask merge threads to abort, and, wait for them to finally finish
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606441 13f79535-47bb-0310-9956-ffa450edef68
2007-12-22 10:06:28 +00:00
Grant Ingersoll
6e20e41418 should not have been committed
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606305 13f79535-47bb-0310-9956-ffa450edef68
2007-12-21 20:45:35 +00:00
Michael McCandless
0c6efd3dee make sure to close the IndexInput used to read index version
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606299 13f79535-47bb-0310-9956-ffa450edef68
2007-12-21 20:31:48 +00:00
Grant Ingersoll
7dfe984867 Checkin of WikipediaTokenizer that extends StandardTokenizer using JFlex. examples/wikipedia/README contains info on running.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@606292 13f79535-47bb-0310-9956-ffa450edef68
2007-12-21 20:08:24 +00:00
Grant Ingersoll
ca821526b0 removed bad chars at end of file
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605966 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 15:33:16 +00:00
Grant Ingersoll
36b1206ad8 Restoring ExtractWikipedia, as it is still a handy class to have around. Splitting the documents is useful for debugging purposes when you know you want to look at a specific document instead of grepping through a really large file.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605959 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 15:14:24 +00:00
Grant Ingersoll
7a3a61e45d Checking in simple performance test
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605957 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 15:07:59 +00:00
Michael McCandless
52d307607c LUCENE-1094: don't corrupt fdt file on hitting an exception partway through indexing a document
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605912 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 13:07:29 +00:00
Grant Ingersoll
be794a3832 LUCENE-25: added test for stopwords in query parser
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605535 13f79535-47bb-0310-9956-ffa450edef68
2007-12-19 13:36:32 +00:00
Grant Ingersoll
516143dea0 LUCENE-1045: Applied original LUCENE-1045.patch that refactors original 1045 patch to use ExtendedFieldCache and DOES NOT make FieldCache a class
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605225 13f79535-47bb-0310-9956-ffa450edef68
2007-12-18 15:13:05 +00:00
Michael McCandless
905674805c LUCENE-1092: fix KeywordAnalyzer.reusableTokenStream so it can successfully be reused
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605149 13f79535-47bb-0310-9956-ffa450edef68
2007-12-18 09:20:04 +00:00
Michael McCandless
10c1ec3a66 LUCENE-1089: add new PriorityQueue.insertWithOverflow method to allow for re-use
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@604949 13f79535-47bb-0310-9956-ffa450edef68
2007-12-17 18:05:13 +00:00
Grant Ingersoll
55d0c3a2f8 LUCENE-1077: refactored to have a common PayloadHelper class. Also added TokenOffsetPayloadTokenFilter, which encodes the Token offset into the payloads
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@604870 13f79535-47bb-0310-9956-ffa450edef68
2007-12-17 13:55:46 +00:00
Doron Cohen
b7e167ac8d LUCENE-1086: DocMakers setup for the "docs.dir" property fails when passing an absolute path.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@603856 13f79535-47bb-0310-9956-ffa450edef68
2007-12-13 08:58:52 +00:00
Doron Cohen
73f9e7ebc0 fix potential thread-safety issue in contrib/benchmark's TrecDocMaker.
(follow-up to http://svn.apache.org/viewvc?view=rev&revision=602475)


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@603846 13f79535-47bb-0310-9956-ffa450edef68
2007-12-13 07:26:58 +00:00