Robert Muir
|
8f0d7cc135
|
LUCENE-3930: nuke jars from source tree and use ivy
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1307563 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-30 18:04:43 +00:00 |
Ryan McKinley
|
49f43806a8
|
LUCENE-2000: remove redundant casts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1307012 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-29 17:34:34 +00:00 |
Michael McCandless
|
e49b69d459
|
tests: get JRE bug workaround working for this test again
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306931 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-29 15:43:03 +00:00 |
Ryan McKinley
|
05fe168961
|
LUCENE-2000: clone() now returns covariant types where possible.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306626 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-28 22:22:25 +00:00 |
Christian Moen
|
ec18632428
|
Fixed various issues related to config and user dictionaries for Kuromoji (SOLR-3276)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306476 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-28 17:20:48 +00:00 |
Robert Muir
|
bca62a44d3
|
LUCENE-3929: add a test demonstrating this works
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305870 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-27 15:16:42 +00:00 |
Robert Muir
|
620f9a5739
|
small opto when charfilter is used: don't call this method twice in end
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305742 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-27 06:06:51 +00:00 |
Robert Muir
|
ae0f44fcb9
|
remaining eol-style fixes to trunk, native except .sh (LF)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305492 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-26 18:57:08 +00:00 |
Robert Muir
|
a29a14698e
|
fix eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305339 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-26 12:58:58 +00:00 |
Christian Moen
|
f5770479e3
|
Move and rename Kuromoji (LUCENE-3909)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305297 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-26 10:31:48 +00:00 |
Robert Muir
|
35705cc396
|
LUCENE-3919: fix czechstemmer aioobe on the empty term
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305177 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-25 23:40:44 +00:00 |
Michael McCandless
|
cb1a9a0cdf
|
LUCENE-3897: if best scoring path is ahead of current pos, move forward
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305149 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-25 21:37:55 +00:00 |
Michael McCandless
|
a278ba7a0c
|
LUCENE-3897: fix silly bug in forced backtrace
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305086 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-25 17:51:26 +00:00 |
Christian Moen
|
c3ddb9dc67
|
Added KuromojiReadingFormFilter (LUCENE-3915)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305046 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-25 14:17:23 +00:00 |
Steven Rowe
|
fb33754168
|
LUCENE-3881: Added UAX29URLEmailAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304975 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-25 01:20:55 +00:00 |
Steven Rowe
|
ada9780484
|
LUCENE-3913: Fix HTMLStripCharFilter invalid final offset for input containing </br>
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304912 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-24 20:54:31 +00:00 |
Robert Muir
|
f597b9a1cc
|
LUCENE-3883: Irish Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304836 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-24 15:59:04 +00:00 |
Christian Moen
|
63f1c48b7d
|
Added katakana stem filter (LUCENE-3901)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304719 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-24 06:38:53 +00:00 |
Michael McCandless
|
7291d38535
|
LUCENE-3905: sometimes run real-ish content (from LineFileDocs) through the analyzers too; fix end() offset bugs in the ngram tokenizers/filters
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304525 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-23 17:39:13 +00:00 |
Robert Muir
|
86c2da0eac
|
happy new year
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303828 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-22 15:21:17 +00:00 |
Robert Muir
|
c3305a50ff
|
add some more kuromoji javadocs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303746 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-22 12:21:48 +00:00 |
Christian Moen
|
d2eebf9330
|
Fix for LUCENE-3897 (KuromojiTokenizer fails with large docs)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303739 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-22 11:41:54 +00:00 |
Robert Muir
|
a6fd306dfb
|
add missing license headers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303738 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-22 11:33:45 +00:00 |
Michael McCandless
|
1a191f4edc
|
LUCENE-3898: reset() was missing some state
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303441 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 15:22:28 +00:00 |
Robert Muir
|
fb395f66a3
|
use MockTokenizer instead of WhitespaceTokenizer for better testing
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303382 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 13:10:38 +00:00 |
Michael McCandless
|
595744089a
|
LUCENE-3896: CharacterUtils.fill must call Reader.read again if it only got a single high surrogate char on the first read
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303374 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 12:53:27 +00:00 |
Robert Muir
|
f75d40dad5
|
LUCENE-3894: try toning down for this tokenizer (it builds lots of tokens from the input treated as a path)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303276 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 04:30:11 +00:00 |
Robert Muir
|
1156de050f
|
LUCENE-3894: add large docs tests for more tokenizers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303273 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 03:59:14 +00:00 |
Robert Muir
|
dd7bfc78d9
|
LUCENE-3894: for tokenizers, add some tests for larger documents
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303258 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 02:54:07 +00:00 |
Robert Muir
|
3d73a3014e
|
LUCENE-3896: beef up TestDuelingAnalyzers for larger documents
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303253 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-21 01:52:22 +00:00 |
Michael McCandless
|
c20242721f
|
LUCENE-3894: some tokenizers weren't reading all input chars
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303193 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-20 23:02:37 +00:00 |
Robert Muir
|
b7a7e5a625
|
LUCENE-3889: remove unnecessary/unused base class
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303026 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-20 17:28:26 +00:00 |
Jan Høydahl
|
5648222e86
|
SOLR-2764: Fix testcase for minimal stemmer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302872 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-20 13:12:39 +00:00 |
Jan Høydahl
|
54d48eb98b
|
SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302833 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-20 10:57:50 +00:00 |
Robert Muir
|
790323780f
|
basic javadocs improvements, mostly simple descriptions where the class had nothing before
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302752 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-20 02:09:25 +00:00 |
Robert Muir
|
4a2b1d974a
|
javadocs: add missing package.htmls
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302713 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-19 23:20:25 +00:00 |
Steven Rowe
|
c4f72f61ac
|
LUCENE-3880: UAX29URLEmailTokenizer now recognizes emails when the mailto: scheme is prepended.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302265 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-19 03:13:52 +00:00 |
Robert Muir
|
3d2d144f92
|
LUCENE-3848: don't produce tokenstreams that start with posinc=0
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301478 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-16 13:06:30 +00:00 |
Uwe Schindler
|
3d8b22ffd0
|
LUCENE-3850: Fix rawtypes warnings for Java 7 compiler (#2)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1297162 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-05 18:48:04 +00:00 |
Uwe Schindler
|
989530e17e
|
LUCENE-3850: Fix rawtypes warnings for Java 7 compiler
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1297048 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-05 13:34:40 +00:00 |
Christian Moen
|
430365f7cc
|
Kuromoji now produces both compound words and the segmentation of those words in search mode (LUCENE-3767)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1296805 13f79535-47bb-0310-9956-ffa450edef68
|
2012-03-04 13:34:13 +00:00 |
Dawid Weiss
|
8c2e3cef8f
|
LUCENE-3820: limiting the amount of input for pattern matching to go past exponential time patterns, even if they happen. A nice catch from Mike too -- un-ignore testNastyPattern and look at processing time go wild with each additional input character...
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294797 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-28 19:26:05 +00:00 |
Dawid Weiss
|
f3cc65733b
|
Sysout of the randomized pattern.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294518 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-28 08:15:38 +00:00 |
Dawid Weiss
|
4d401ca87d
|
Test thread's name reflects the current seed.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294514 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-28 08:04:42 +00:00 |
Dawid Weiss
|
493bd8b42f
|
LUCENE-3820: optimistic limit on running time for the randomized pattern test. This doesn't eliminate the possibility of hitting an exponential time pattern, but I re-ran it a few times and it seems to be pretty stable.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294322 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-27 20:50:24 +00:00 |
Dawid Weiss
|
7be5533989
|
LUCENE-3820: Wrong trailing index calculation in PatternReplaceCharFilter.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294141 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-27 13:13:10 +00:00 |
Tommaso Teofili
|
482c0610fd
|
[LUCENE-3731] - refactored analyzeText method to initializeIterator and made it abstract inside BaseUIMATokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1293614 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-25 14:14:00 +00:00 |
Tommaso Teofili
|
930816cc5b
|
LUCENE-3731 - AEProviderFactory getAEProvider logic cleaned
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1292585 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-22 23:39:51 +00:00 |
Robert Muir
|
e51795be39
|
LUCENE-3731: remove unnecessary code
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1244714 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-15 20:53:53 +00:00 |
Robert Muir
|
c97e3edbb9
|
LUCENE-3731: performance improvements and thread safety fixes to UIMA tokenizers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1244688 13f79535-47bb-0310-9956-ffa450edef68
|
2012-02-15 20:29:20 +00:00 |