275 Commits

Author SHA1 Message Date
Robert Muir 776e1b4a98 LUCENE-3990: don't extend charfilter here, delegate all methods
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326468 13f79535-47bb-0310-9956-ffa450edef68
2012-04-16 02:56:43 +00:00
Robert Muir 1b48bfd173 LUCENE-3990: only set offsetsAreCorrect=false if we are actually going to use that tokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326457 13f79535-47bb-0310-9956-ffa450edef68
2012-04-16 01:31:55 +00:00
Dawid Weiss cf85aab1a0 LUCENE-3808: Switch LuceneTestCaseRunner to RandomizedRunner. Enforce Random sharing contracts. Enforce thread leaks.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326351 13f79535-47bb-0310-9956-ffa450edef68
2012-04-15 14:41:44 +00:00
Robert Muir f3536126ba LUCENE-3965: contrib-build -> modules-build
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326247 13f79535-47bb-0310-9956-ffa450edef68
2012-04-15 02:07:08 +00:00
Robert Muir 29d790612e LUCENE-3971: re-enable this filter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326137 13f79535-47bb-0310-9956-ffa450edef68
2012-04-14 16:01:27 +00:00
Uwe Schindler d3b73b2ec4 LUCENE-3971: Remove dead code
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326082 13f79535-47bb-0310-9956-ffa450edef68
2012-04-14 10:13:14 +00:00
Dawid Weiss 81d8a18641 LUCENE-3971: MappingCharFilter could return invalid final token position.
(Dawid Weiss, Robert Muir)

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326065 13f79535-47bb-0310-9956-ffa450edef68
2012-04-14 07:32:42 +00:00
Robert Muir e8008068b2 LUCENE-3969: Merged /lucene/dev/trunk:r1311219-1324765
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324945 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 19:41:06 +00:00
Robert Muir a1c1ac512b LUCENE-3969: this filter currently doesn't handle graph inputs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324930 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 19:30:25 +00:00
Robert Muir c845af5497 LUCENE-3969: clean up nocommits
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324834 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 16:01:07 +00:00
Robert Muir 974ea5ee34 LUCENE-3969: add mappingcharfilter to broken list until its bug is fixed
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324751 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 13:15:33 +00:00
Robert Muir 14928d42c6 LUCENE-3969: add hack for MockLookahead's asserts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324749 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 13:08:10 +00:00
Robert Muir 69fafd4791 LUCENE-3969: clear this in reset()
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324747 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 13:05:22 +00:00
Robert Muir bf2549a27b LUCENE-3969: add hack for MockGraph's asserts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324734 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 12:23:15 +00:00
Robert Muir 71291daa74 LUCENE-3969: when outputting a bigram token, mark posLen=2 to note that it spans two tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1324727 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 12:16:31 +00:00
Michael McCandless 5bae28d57e LUCENE-3970: rename getUniqueTerm/FieldCount() to size()
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1312037 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 23:21:39 +00:00
Robert Muir 6954ba2410 LUCENE-3969: fix BaseTokenTest to do the same work in multi-threads that it did in single-threads, so it really shouldn't fail from another thread unless you have an actual thread problem
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311950 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 19:31:01 +00:00
Michael McCandless a9535971f3 disable test until we can fix syn filter to consume graphs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311937 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 19:18:15 +00:00
Uwe Schindler 842a54c290 LUCENE-3969: revert Whitespace
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311920 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 18:50:54 +00:00
Robert Muir c58dfd5516 LUCENE-3969: demote the n-grams again (with explanation)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311915 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 18:36:34 +00:00
Robert Muir ad994d8281 LUCENE-3969: promote edgeNgrams from 'totally broken list' to 'broken offsets list'
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311869 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 17:02:11 +00:00
Michael McCandless b67e7a0a9b LUCENE-3969: make full offset checking optional and disable for the known (buggy) offenders
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311864 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 16:54:54 +00:00
Robert Muir 6563a58a2a LUCENE-3969: add new random test for MappingCharFilter (sometimes fails, due to same final offset bug)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311765 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 14:49:36 +00:00
Robert Muir f97ac2d0cb LUCENE-3969: add failing test case for MappingCharFilter wrong final offset
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311761 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 14:38:39 +00:00
Robert Muir 8966429dab LUCENE-3969: disable these for now so we can work on the other issues
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311748 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 14:19:09 +00:00
Uwe Schindler 3706fbc5b0 Fix ShingleFilter reuse, some minor changes to testcase for speed and consistency
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311724 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 13:50:03 +00:00
Michael McCandless a764c0d021 LUCENE-3969: add whitespace to analyzer description
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311667 13f79535-47bb-0310-9956-ffa450edef68
2012-04-10 10:28:24 +00:00
Michael McCandless 3e098abaed LUCENE-3969: ValidatingTokenFilter shouldn't create new atts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311405 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 20:00:50 +00:00
Michael McCandless 11a65763d0 LUCENE-3969: remove nocommit
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311400 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 19:45:16 +00:00
Michael McCandless ad5c89b1b1 LUCENE-3969: validate after each analysis stage; tentatively add posLen to ShingleFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311373 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 19:05:47 +00:00
Uwe Schindler f6f8e38cfa LUCENE-3969: Simplify the crazy Reader wrapper
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311358 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 17:53:27 +00:00
Robert Muir f41576a306 LUCENE-3969: don't get caught by tokenizers that consume in ctor and throw IAE or UOE ever again
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311351 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 17:32:39 +00:00
Robert Muir 2a01acc0e8 LUCENE-3969: don't use scary attsource ctor yet, and always print the analyzer for now
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311339 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 17:21:46 +00:00
Uwe Schindler 79baa1f682 LUCENE-3969: Remove unneeded wildcards
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311331 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 17:08:19 +00:00
Uwe Schindler eae8e8159d LUCENE-3969: Remove useless success variable
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311322 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 16:56:35 +00:00
Uwe Schindler bd8bdb08b3 LUCENE-3969: Remove code duplication
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311320 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 16:52:14 +00:00
Michael McCandless 4456273922 LUCENE-3969: fix PatternTokenizer to not consume chars from the input Reader if it throws IAE
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311318 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 16:47:56 +00:00
Michael McCandless d76a03214c LUCENE-3969: add missing IAE to WikipediaTokenizer ctor
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311294 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 16:00:41 +00:00
Uwe Schindler 102ece7710 LUCENE-3969: More cleanups
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311282 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 15:32:08 +00:00
Uwe Schindler 214ab39f68 LUCENE-3969: Minor cleanups and code consistency
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311278 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 15:15:11 +00:00
Robert Muir ac393486e0 LUCENE-3969: don't allow negative subword params, Hyphenation relies upon this to filter out what appear to be bogus hyphenation points
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311257 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 14:31:25 +00:00
Robert Muir 24f8a9e627 LUCENE-3969: disable PositionFilter for now
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311241 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 14:16:35 +00:00
Robert Muir f63af6afe5 LUCENE-3969: don't be this evil yet for type char
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311235 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 13:44:18 +00:00
Robert Muir 6311f71de6 LUCENE-3969: commit current state
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3969@1311220 13f79535-47bb-0310-9956-ffa450edef68
2012-04-09 13:25:28 +00:00
Robert Muir 27dbcaefdc revert bogus fix (assault against a police officer)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311113 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 22:10:08 +00:00
Robert Muir 00c2246e44 fix generification bug
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311110 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 21:56:03 +00:00
Michael McCandless c63f95911a LUCENE-3942: syn filter sets posLen when possible
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1311100 13f79535-47bb-0310-9956-ffa450edef68
2012-04-08 20:55:32 +00:00
Michael McCandless 755ebafa49 LUCENE-3873: add MockGraphTokenFilter, inserting random graph tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310910 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 23:06:12 +00:00
Uwe Schindler 62890c8089 LUCENE-3919: Remove useless loop
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310898 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 22:33:13 +00:00
Uwe Schindler bdaa79206d LUCENE-3919: Die, context class loader, die. Also don't initialize (run static ctors) unrelated classes!
@UweSays: "If you get the context classloader from a thread, in most cases you are doing something wrong because you don't understand how Java classloading works."

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1310893 13f79535-47bb-0310-9956-ffa450edef68
2012-04-07 22:27:57 +00:00