36 Commits

Author SHA1 Message Date
Dawid Weiss 81d8a18641 LUCENE-3971: MappingCharFilter could return invalid final token position.
(Dawid Weiss, Robert Muir)

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1326065 13f79535-47bb-0310-9956-ffa450edef68
2012-04-14 07:32:42 +00:00
Robert Muir 16f5be0efb LUCENE-3969: Test all ctors in TestRandomChains and fix bugs discovered by the test
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1324960 13f79535-47bb-0310-9956-ffa450edef68
2012-04-11 19:54:09 +00:00
Jan Høydahl 54d48eb98b SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302833 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 10:57:50 +00:00
Dawid Weiss 7be5533989 LUCENE-3820: Wrong trailing index calculation in PatternReplaceCharFilter.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1294141 13f79535-47bb-0310-9956-ffa450edef68
2012-02-27 13:13:10 +00:00
Robert Muir 72ae3171be LUCENE-3765: Trappy behavior with StopFilter/ignoreCase
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1242497 13f79535-47bb-0310-9956-ffa450edef68
2012-02-09 19:59:50 +00:00
Robert Muir fbd34b4390 cleanups to 4.x CHANGES
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1231552 13f79535-47bb-0310-9956-ffa450edef68
2012-01-14 18:24:48 +00:00
Robert Muir 4ebdc0872a LUCENE-3305: sorry Mike (thanks for the help with the FST optimization)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1230756 13f79535-47bb-0310-9956-ffa450edef68
2012-01-12 20:24:40 +00:00
Robert Muir cd372bdc83 LUCENE-3305: add Kuromoji Japanese morphological analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1230748 13f79535-47bb-0310-9956-ffa450edef68
2012-01-12 20:10:48 +00:00
Christopher John Male 318911200d LUCENE-3434: Removed state changing setters in ShingleAnalyzerWrapper and PerFieldAnalyzerWrapper
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1170942 13f79535-47bb-0310-9956-ffa450edef68
2011-09-15 03:21:17 +00:00
Christopher John Male 94028fe11a LUCENE-3431: Removed deprecated addStopwords methods in QueryAutoStopWordAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1170424 13f79535-47bb-0310-9956-ffa450edef68
2011-09-14 03:33:50 +00:00
Christopher John Male 4c5606ee29 LUCENE-3396: Converted most Analyzers over to using ReusableAnalyzerBase
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1169607 13f79535-47bb-0310-9956-ffa450edef68
2011-09-12 05:50:26 +00:00
Christopher John Male e3172b9239 LUCENE-3414: Added Hunspell for Lucene
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1167467 13f79535-47bb-0310-9956-ffa450edef68
2011-09-10 06:00:39 +00:00
Robert Muir 128aaf8387 LUCENE-3410: move changes to 3.5 and nuke deprecated code in trunk
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1166770 13f79535-47bb-0310-9956-ffa450edef68
2011-09-08 15:56:01 +00:00
Christopher John Male 4b44bd7d83 LUCENE-3410: Deprecated multi-int constructors in WordDelimiterFilter. Now uses int bitfield
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1165995 13f79535-47bb-0310-9956-ffa450edef68
2011-09-07 04:43:10 +00:00
Christopher John Male 1057d24e7f LUCENE-3400: Removed DutchAnalyzer.setStemDictionary
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1161484 13f79535-47bb-0310-9956-ffa450edef68
2011-08-25 10:32:21 +00:00
Dawid Weiss 29b09032d3 LUCENE-2341: integrating morfologik (Polish stemming/ morphosyntactic dictionary) as an analysis module.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1141671 13f79535-47bb-0310-9956-ffa450edef68
2011-06-30 19:12:54 +00:00
Dawid Weiss f85c4e7c88 Reverting 1141022 (needs to wait for 1.6 support).
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1141032 13f79535-47bb-0310-9956-ffa450edef68
2011-06-29 10:00:36 +00:00
Dawid Weiss d188d3df90 LUCENE-2341: integrating morfologik (Polish stemming/ morphosyntactic dictionary) as an analysis module.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1141022 13f79535-47bb-0310-9956-ffa450edef68
2011-06-29 09:24:14 +00:00
Robert Muir 063d18e280 LUCENE-3163: add link to jira versions information to CHANGES.txt files
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1129656 13f79535-47bb-0310-9956-ffa450edef68
2011-05-31 13:03:40 +00:00
Robert Muir 4455345c6e LUCENE-3063: factor CharTokenizer/CharacterUtils into analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1098871 13f79535-47bb-0310-9956-ffa450edef68
2011-05-03 00:29:47 +00:00
Robert Muir 308e0bd4a9 LUCENE-2514, LUCENE-2551: collation uses byte[] keys, deprecate old unscalable locale sort/range, termrangequery/filter work on bytes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1075210 13f79535-47bb-0310-9956-ffa450edef68
2011-02-28 05:15:50 +00:00
Koji Sekiguchi 6f31407109 SOLR-1057: Add PathHierarchyTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067131 13f79535-47bb-0310-9956-ffa450edef68
2011-02-04 10:19:52 +00:00
Steven Rowe 1b22e86417 LUCENE-2847: Support all of unicode, including supplementary code points above the basic multilingual plane, in StandardTokenizer and UAX29URLEmailTokenizer.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055877 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 13:51:10 +00:00
Steven Rowe 2b9726ae81 LUCENE-2763: Swap URL+Email recognizing StandardTokenizer and UAX29Tokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1043071 13f79535-47bb-0310-9956-ffa450edef68
2010-12-07 14:53:13 +00:00
Uwe Schindler 6f230c5e08 revert changes (will come in 3.x)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029347 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 14:03:50 +00:00
Uwe Schindler 819344aeab LUCENE-2732: Fix charset problems in XML loading in HyphenationCompoundWordTokenFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029345 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 13:56:46 +00:00
Steven Rowe 7f6dd505f1 LUCENE-2699: Update StandardTokenizer and UAX29Tokenizer to Unicode 6.0.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1022826 13f79535-47bb-0310-9956-ffa450edef68
2010-10-15 05:41:54 +00:00
Steven Rowe f9e4f551e2 LUCENE-1370: Added ShingleFilter option to output unigrams if no shingles can be generated.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1006187 13f79535-47bb-0310-9956-ffa450edef68
2010-10-09 16:55:23 +00:00
Steven Rowe 3c26a9167c LUCENE-2167: Implement StandardTokenizer with the UAX#29 Standard
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1002032 13f79535-47bb-0310-9956-ffa450edef68
2010-09-28 06:16:16 +00:00
Robert Muir 8f71031ac8 LUCENE-2413: consolidate remaining solr tokenstreams into modules/analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@957162 13f79535-47bb-0310-9956-ffa450edef68
2010-06-23 11:25:17 +00:00
Robert Muir a0c72afb31 LUCENE-2413: move more core analysis to analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@948225 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 22:28:32 +00:00
Robert Muir 71b59ca566 LUCENE-2413: consolidate remaining concrete core analyzers to modules/analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@948195 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 20:16:44 +00:00
Robert Muir 5259d7d90b LUCENE-2413: move KeywordMarkerFilter to analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@946621 13f79535-47bb-0310-9956-ffa450edef68
2010-05-20 13:23:12 +00:00
Robert Muir 5ccb3ae286 LUCENE-2413: fold contrib/icu into analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@946590 13f79535-47bb-0310-9956-ffa450edef68
2010-05-20 10:46:00 +00:00
Robert Muir 1e1296e6f8 sync all changes to reflect reality
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@941710 13f79535-47bb-0310-9956-ffa450edef68
2010-05-06 13:08:59 +00:00
Robert Muir bef21b3e18 LUCENE-2444: boilerplate stuff for the analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@941369 13f79535-47bb-0310-9956-ffa450edef68
2010-05-05 16:27:58 +00:00