182 Commits

Author SHA1 Message Date
Uwe Schindler dad7e60253 LUCENE-2157: DelimitedPayloadTokenFilter no longer copies the buffer over itself, instead it sets the length to the offset of the delimiter. Also optimizes logic and IdentityEncoder to use NIO.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@890791 13f79535-47bb-0310-9956-ffa450edef68
2009-12-15 13:27:27 +00:00
Simon Willnauer 6c0c318218 LUCENE-2100: Marked all contrib Analyzer subclasses as final. Analyzers should only act as a composition of TokenStreams; users should compose their own analyzers instead of subclassing existing ones.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@888799 13f79535-47bb-0310-9956-ffa450edef68
2009-12-09 13:32:32 +00:00
Simon Willnauer 9ee4ce0fd5 LUCENE-2102: Add Turkish LowerCaseFilter which handles Turkish and Azeri unique casing behavior correctly.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@887535 13f79535-47bb-0310-9956-ffa450edef68
2009-12-05 12:46:05 +00:00
Simon Willnauer a0bf23d762 fixed javadoc warnings due to missing closing braces
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@887122 13f79535-47bb-0310-9956-ffa450edef68
2009-12-04 09:10:21 +00:00
Robert Muir 892bc7f55a LUCENE-2062: Bulgarian Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@886190 13f79535-47bb-0310-9956-ffa450edef68
2009-12-02 16:08:56 +00:00
Uwe Schindler 9edfb3b66a LUCENE-2094: Prepare CharArraySet for Unicode 4.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@885592 13f79535-47bb-0310-9956-ffa450edef68
2009-11-30 21:49:21 +00:00
Uwe Schindler 09fd7abd7a fix javadoc
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@885571 13f79535-47bb-0310-9956-ffa450edef68
2009-11-30 19:55:57 +00:00
Robert Muir 2ef402eefa LUCENE-2067: Add a stemmer for Czech
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@885216 13f79535-47bb-0310-9956-ffa450edef68
2009-11-29 11:59:38 +00:00
Robert Muir f0e064eb41 LUCENE-2069: supplementary char support for lowercasefilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@885024 13f79535-47bb-0310-9956-ffa450edef68
2009-11-27 21:34:11 +00:00
Simon Willnauer e69141c51a LUCENE-2068: Fixed ReverseStringFilter for Unicode 4.0. Reverse Supplementary Characters correctly.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@883149 13f79535-47bb-0310-9956-ffa450edef68
2009-11-22 21:09:42 +00:00
Simon Willnauer ba4769d418 Fixed JavaDoc - spelling issues in @param
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@880727 13f79535-47bb-0310-9956-ffa450edef68
2009-11-16 12:33:10 +00:00
Uwe Schindler 00f07ee460 LUCENE-2051: Contrib Analyzer Setters should be deprecated and replaced with ctor arguments, thanks to Simon Willnauer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@880715 13f79535-47bb-0310-9956-ffa450edef68
2009-11-16 11:48:37 +00:00
Uwe Schindler 7370094ead Fix some javadocs errors in contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@880706 13f79535-47bb-0310-9956-ffa450edef68
2009-11-16 11:08:25 +00:00
Uwe Schindler 945e7eda52 LUCENE-2052: add varargs where possible
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@836248 13f79535-47bb-0310-9956-ffa450edef68
2009-11-14 19:26:49 +00:00
Uwe Schindler 5b83cc59b2 LUCENE-1257: Generics: *heavy* Robert Muir & mine patch
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@834847 13f79535-47bb-0310-9956-ffa450edef68
2009-11-11 12:18:34 +00:00
Robert Muir 786eb6ce0d LUCENE-2012: add remaining @overrides (contrib,demo)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@833867 13f79535-47bb-0310-9956-ffa450edef68
2009-11-08 12:45:12 +00:00
Robert Muir 80e8bfbbc9 LUCENE-2031: Move patternanalyzer from memory contrib into analyzers
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@832889 13f79535-47bb-0310-9956-ffa450edef68
2009-11-04 22:37:01 +00:00
Simon Willnauer a5da31ef90 Trivial fix of ignored return value of reader.read(). Done during hackathon - review by uschindler
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@832554 13f79535-47bb-0310-9956-ffa450edef68
2009-11-03 20:58:12 +00:00
Simon Willnauer e84f86d497 Trivial fix - changed new Character('_') into Character.valueOf('_')
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@832549 13f79535-47bb-0310-9956-ffa450edef68
2009-11-03 20:49:57 +00:00
Robert Muir 9b0c42a9c1 fix confusing smartcn javadoc bug
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831913 13f79535-47bb-0310-9956-ffa450edef68
2009-11-02 15:08:42 +00:00
Robert Muir 066eac49a4 LUCENE-2022: remove deprecated api from contrib/analysis and wikipedia
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831425 13f79535-47bb-0310-9956-ffa450edef68
2009-10-30 19:04:30 +00:00
Robert Muir cc374d7efc set RussianLowerCaseFilter deprecation to 4.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831391 13f79535-47bb-0310-9956-ffa450edef68
2009-10-30 17:11:10 +00:00
Michael McCandless 13593aa802 LUCENE-2002: restore RussianLowerCaseFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831284 13f79535-47bb-0310-9956-ffa450edef68
2009-10-30 12:45:11 +00:00
Robert Muir 0733caac5f LUCENE-2021: use chararrayset in french elision filter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831268 13f79535-47bb-0310-9956-ffa450edef68
2009-10-30 11:25:10 +00:00
Robert Muir 8861ba2ffd LUCENE-2014: add a thai test to prevent any similar regression
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831189 13f79535-47bb-0310-9956-ffa450edef68
2009-10-30 03:26:44 +00:00
Robert Muir 19e55ea991 LUCENE-1257: port smartchineseanalyzer to java 5
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@831121 13f79535-47bb-0310-9956-ffa450edef68
2009-10-29 22:29:50 +00:00
Robert Muir 1b38f9c24d LUCENE-2014: SmartChineseAnalyzer position increment bug
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@830871 13f79535-47bb-0310-9956-ffa450edef68
2009-10-29 09:22:37 +00:00
Michael McCandless 74f872182e fix some javadoc warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@829817 13f79535-47bb-0310-9956-ffa450edef68
2009-10-26 14:55:51 +00:00
Uwe Schindler 7902c4b729 Remove the remaining deprecated ctors from TokenStream API test base class (BaseTokenStreamTestCase). They were used to test old and new TokenStream API.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@829244 13f79535-47bb-0310-9956-ffa450edef68
2009-10-23 21:21:17 +00:00
Michael McCandless aaddac8992 LUCENE-2002: add Version to QueryParser & contrib analyzers
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@829206 13f79535-47bb-0310-9956-ffa450edef68
2009-10-23 20:25:17 +00:00
Robert Muir d1fc6bece6 LUCENE-1359: FrenchAnalyzer tokenstream does not honor the contract of Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@828298 13f79535-47bb-0310-9956-ffa450edef68
2009-10-22 04:03:12 +00:00
Uwe Schindler 04da5e73f2 LUCENE-1998: Parameter -> Java 5 enum transition
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@828156 13f79535-47bb-0310-9956-ffa450edef68
2009-10-21 19:30:06 +00:00
Uwe Schindler 1ae5f89cfb LUCENE-1987: Remove rest of analysis deprecations (StandardAnalyzer, StopAnalyzer)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@827979 13f79535-47bb-0310-9956-ffa450edef68
2009-10-21 12:12:11 +00:00
Robert Muir e053d80455 LUCENE-1966: ArabicAnalyzer stopwords cleanup
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@825110 13f79535-47bb-0310-9956-ffa450edef68
2009-10-14 12:24:18 +00:00
Uwe Schindler 4cded8042c LUCENE-1946, LUCENE-1753: Remove deprecated TokenStream API. What a pity, my wonderful backwards layer is gone! :-( Enforce decorator pattern by making the rest of TokenStreams final.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@824116 13f79535-47bb-0310-9956-ffa450edef68
2009-10-11 17:35:09 +00:00
Robert Muir 877c9ff521 For fa analyzer, add a test for custom stopwords
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@823546 13f79535-47bb-0310-9956-ffa450edef68
2009-10-09 13:27:14 +00:00
Robert Muir 956c8cda82 LUCENE-1963: Lowercase before stopfilter in ArabicAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@823534 13f79535-47bb-0310-9956-ffa450edef68
2009-10-09 12:55:47 +00:00
Michael McCandless f20e419aff LUCENE-1950: remove autoCommit=true from IndexWriter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@823321 13f79535-47bb-0310-9956-ffa450edef68
2009-10-08 20:57:32 +00:00
Simon Willnauer 05b7822170 LUCENE-1965: Lazy Atomic Loading Stopwords in SmartCN
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@823285 13f79535-47bb-0310-9956-ffa450edef68
2009-10-08 19:21:36 +00:00
Simon Willnauer 286cb1f9d2 LUCENE-1962: Cleaned up Persian & Arabic Analyzer. Prevent default stopword list from being loaded more than once.
- replace if blocks with a single switch
- marking private members final where needed
- changed protected visibility to final in final class.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@823180 13f79535-47bb-0310-9956-ffa450edef68
2009-10-08 13:54:18 +00:00
Michael McCandless c11776d2c6 remove tags
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@822781 13f79535-47bb-0310-9956-ffa450edef68
2009-10-07 15:41:09 +00:00
Michael Busch d7d9241ef7 LUCENE-1856: Remove Hits.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@822587 13f79535-47bb-0310-9956-ffa450edef68
2009-10-07 05:08:22 +00:00
Karl-Johan Wettin b3f73db537 LUCENE-1939: IndexOutOfBoundsException at ShingleMatrixFilter's Iterator#hasNext method on exhausted streams.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821888 13f79535-47bb-0310-9956-ffa450edef68
2009-10-05 16:01:17 +00:00
Uwe Schindler 236baf9fcb LUCENE-1944: Cleanup contrib to not use deprecated APIs
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821444 13f79535-47bb-0310-9956-ffa450edef68
2009-10-03 23:24:33 +00:00
Robert Muir 8da43c4bb8 LUCENE-1916: smartcn hhmm doc translation
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821325 13f79535-47bb-0310-9956-ffa450edef68
2009-10-03 14:24:45 +00:00
Robert Muir 1f9088b038 LUCENE-1943: Improve performance of ChineseFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821322 13f79535-47bb-0310-9956-ffa450edef68
2009-10-03 13:54:12 +00:00
Karl-Johan Wettin 4f878bdc93 LUCENE-1257: Generified ShingleMatrixFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821311 13f79535-47bb-0310-9956-ffa450edef68
2009-10-03 13:17:11 +00:00
Uwe Schindler 835de0b44d LUCENE-1833: Change all new Number() ctors to Number.valueOf()
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821186 13f79535-47bb-0310-9956-ffa450edef68
2009-10-02 22:16:44 +00:00
Uwe Schindler af0e97fd72 LUCENE-1257: Replace StringBuffer by StringBuilder where possible
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@821185 13f79535-47bb-0310-9956-ffa450edef68
2009-10-02 22:11:10 +00:00
Robert Muir dd9c1b0101 LUCENE-1936: Remove deprecated charset support from Greek and Russian analyzers
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@820756 13f79535-47bb-0310-9956-ffa450edef68
2009-10-01 19:20:09 +00:00