Uwe Schindler
102ece7710
LUCENE-3969: More cleanups
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311282 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 15:32:08 +00:00
Uwe Schindler
214ab39f68
LUCENE-3969: Minor cleanups and code consistency
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311278 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 15:15:11 +00:00
Robert Muir
ac393486e0
LUCENE-3969: don't allow negative subword params, Hyphenation relies upon this to filter out what appear to be bogus hyphenation points
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311257 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 14:31:25 +00:00
Robert Muir
24f8a9e627
LUCENE-3969: disable PositionFilter for now
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311241 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 14:16:35 +00:00
Robert Muir
f63af6afe5
LUCENE-3969: don't be this evil yet for type char
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311235 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 13:44:18 +00:00
Robert Muir
6311f71de6
LUCENE-3969: commit current state
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311220 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 13:25:28 +00:00
Robert Muir
27dbcaefdc
revert bogus fix (assault against a police officer)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311113 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 22:10:08 +00:00
Robert Muir
00c2246e44
fix generification bug
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311110 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 21:56:03 +00:00
Michael McCandless
c63f95911a
LUCENE-3942: syn filter sets posLen when possible
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311100 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 20:55:32 +00:00
Michael McCandless
78b4be5dc6
LUCENE-3940: fix Kuromoji to not produce invalid token graph due to UNK with punctuation being decompounded
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311072 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 19:17:17 +00:00
Michael McCandless
755ebafa49
LUCENE-3873: add MockGraphTokenFilter, inserting random graph tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310910 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 23:06:12 +00:00
Uwe Schindler
62890c8089
LUCENE-3919: Remove useless loop
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310898 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 22:33:13 +00:00
Uwe Schindler
bdaa79206d
LUCENE-3919: Die, context class loader, die. Also don't initialize (run static ctors) unrelated classes!
@UweSays: "If you get the context classloader from a thread, in most cases you are doing something wrong because you don't understand how Java classloading works."
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310893 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 22:27:57 +00:00
Uwe Schindler
7154c5466d
LUCENE-3919: Fix generics and additional checks
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310883 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 22:00:28 +00:00
Robert Muir
ed485b29ec
add basic charfilter support to TestRandomChains
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310805 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 17:37:16 +00:00
Robert Muir
fbc8429905
LUCENE-3919: more thorough testing of analysis chains
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310789 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 15:48:02 +00:00
Chris M. Hostetter
bb7bc2ff44
LUCENE-3945: use sha1 checksums to verify jars pulled from ivy match expectations
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1309503 13f79535-47bb-0310-9956-ffa450edef68
2012-04-04 17:53:32 +00:00
Steven Rowe
0a47c9d4d9
nuke obsolete comment
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1309393 13f79535-47bb-0310-9956-ffa450edef68
2012-04-04 14:04:50 +00:00
Robert Muir
6c7c89c3f9
LUCENE-1866: add exclusion for bocchan test file
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1309255 13f79535-47bb-0310-9956-ffa450edef68
2012-04-04 05:36:52 +00:00
Robert Muir
2fe2e82584
LUCENE-1866: better RAT reporting
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1309248 13f79535-47bb-0310-9956-ffa450edef68
2012-04-04 05:03:53 +00:00
Robert Muir
e5448e2e20
LUCENE-3947: fix rat-sources task to work with tools/ directories
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1309207 13f79535-47bb-0310-9956-ffa450edef68
2012-04-04 01:51:56 +00:00
Robert Muir
6b16efdc22
LUCENE-3930: kuromoji steals icu's jar
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1308423 13f79535-47bb-0310-9956-ffa450edef68
2012-04-02 16:31:59 +00:00
Robert Muir
8f0d7cc135
LUCENE-3930: nuke jars from source tree and use ivy
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1307563 13f79535-47bb-0310-9956-ffa450edef68
2012-03-30 18:04:43 +00:00
Ryan McKinley
49f43806a8
LUCENE-2000: remove redundant casts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1307012 13f79535-47bb-0310-9956-ffa450edef68
2012-03-29 17:34:34 +00:00
Michael McCandless
e49b69d459
tests: get JRE bug workaround working for this test again
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306931 13f79535-47bb-0310-9956-ffa450edef68
2012-03-29 15:43:03 +00:00
Ryan McKinley
05fe168961
LUCENE-2000: clone() now returns covariant types where possible.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306626 13f79535-47bb-0310-9956-ffa450edef68
2012-03-28 22:22:25 +00:00
Christian Moen
ec18632428
Fixed various issues related to config and user dictionaries for Kuromoji (SOLR-3276)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306476 13f79535-47bb-0310-9956-ffa450edef68
2012-03-28 17:20:48 +00:00
Robert Muir
bca62a44d3
LUCENE-3929: add a test demonstrating this works
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305870 13f79535-47bb-0310-9956-ffa450edef68
2012-03-27 15:16:42 +00:00
Robert Muir
620f9a5739
small opto when charfilter is used: don't call this method twice in end
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305742 13f79535-47bb-0310-9956-ffa450edef68
2012-03-27 06:06:51 +00:00
Robert Muir
ae0f44fcb9
remaining eol-style fixes to trunk, native except .sh (LF)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305492 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 18:57:08 +00:00
Robert Muir
a29a14698e
fix eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305339 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 12:58:58 +00:00
Christian Moen
f5770479e3
Move and rename Kuromoji (LUCENE-3909)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305297 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 10:31:48 +00:00
Robert Muir
35705cc396
LUCENE-3919: fix czechstemmer aioobe on the empty term
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305177 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 23:40:44 +00:00
Michael McCandless
cb1a9a0cdf
LUCENE-3897: if best scoring path is ahead of current pos, move forward
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305149 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 21:37:55 +00:00
Michael McCandless
a278ba7a0c
LUCENE-3897: fix silly bug in forced backtrace
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305086 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 17:51:26 +00:00
Christian Moen
c3ddb9dc67
Added KuromojiReadingFormFilter (LUCENE-3915)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305046 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 14:17:23 +00:00
Steven Rowe
fb33754168
LUCENE-3881: Added UAX29URLEmailAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304975 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 01:20:55 +00:00
Steven Rowe
ada9780484
LUCENE-3913: Fix HTMLStripCharFilter invalid final offset for input containing </br>
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304912 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 20:54:31 +00:00
Robert Muir
f597b9a1cc
LUCENE-3883: Irish Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304836 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 15:59:04 +00:00
Christian Moen
63f1c48b7d
Added katakana stem filter (LUCENE-3901)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304719 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 06:38:53 +00:00
Michael McCandless
7291d38535
LUCENE-3905: sometimes run real-ish content (from LineFileDocs) through the analyzers too; fix end() offset bugs in the ngram tokenizers/filters
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304525 13f79535-47bb-0310-9956-ffa450edef68
2012-03-23 17:39:13 +00:00
Robert Muir
86c2da0eac
happy new year
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303828 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 15:21:17 +00:00
Robert Muir
c3305a50ff
add some more kuromoji javadocs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303746 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 12:21:48 +00:00
Christian Moen
d2eebf9330
Fix for LUCENE-3897 (KuromojiTokenizer fails with large docs)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303739 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 11:41:54 +00:00
Robert Muir
a6fd306dfb
add missing license headers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303738 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 11:33:45 +00:00
Michael McCandless
1a191f4edc
LUCENE-3898: reset() was missing some state
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303441 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 15:22:28 +00:00
Robert Muir
fb395f66a3
use MockTokenizer instead of WhitespaceTokenizer for better testing
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303382 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 13:10:38 +00:00
Michael McCandless
595744089a
LUCENE-3896: CharacterUtils.fill must call Reader.read again if it only got a single high surrogate char on the first read
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303374 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 12:53:27 +00:00
Robert Muir
f75d40dad5
LUCENE-3894: try toning down for this tokenizer (it builds lots of tokens from the input treated as a path)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303276 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 04:30:11 +00:00
Robert Muir
1156de050f
LUCENE-3894: add large docs tests for more tokenizers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303273 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 03:59:14 +00:00