673 Commits

Author SHA1 Message Date
Christian Moen ec18632428 Fixed various related to config and user dictionaries for Kuromoji (SOLR-3276)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1306476 13f79535-47bb-0310-9956-ffa450edef68
2012-03-28 17:20:48 +00:00
Robert Muir bca62a44d3 LUCENE-3929: add a test demonstrating this works
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305870 13f79535-47bb-0310-9956-ffa450edef68
2012-03-27 15:16:42 +00:00
Robert Muir 620f9a5739 small opto when charfilter is used: don't call this method twice in end
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305742 13f79535-47bb-0310-9956-ffa450edef68
2012-03-27 06:06:51 +00:00
Robert Muir ae0f44fcb9 remaining eol-style fixes to trunk, native except .sh (LF)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305492 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 18:57:08 +00:00
Robert Muir a29a14698e fix eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305339 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 12:58:58 +00:00
Christian Moen f5770479e3 Move and rename Kuromoji (LUCENE-3909)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305297 13f79535-47bb-0310-9956-ffa450edef68
2012-03-26 10:31:48 +00:00
Robert Muir 35705cc396 LUCENE-3919: fix czechstemmer aioobe on the empty term
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305177 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 23:40:44 +00:00
Michael McCandless cb1a9a0cdf LUCENE-3897: if best scoring path is ahead of current pos, move forward
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305149 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 21:37:55 +00:00
Michael McCandless a278ba7a0c LUCENE-3897: fix silly bug in forced backtrace
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305086 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 17:51:26 +00:00
Christian Moen c3ddb9dc67 Added KuromojiReadingFormFilter (LUCENE-3915)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1305046 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 14:17:23 +00:00
Steven Rowe fb33754168 LUCENE-3881: Added UAX29URLEmailAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304975 13f79535-47bb-0310-9956-ffa450edef68
2012-03-25 01:20:55 +00:00
Steven Rowe ada9780484 LUCENE-3913: Fix HTMLStripCharFilter invalid final offset for input containing </br>
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304912 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 20:54:31 +00:00
Robert Muir f597b9a1cc LUCENE-3883: Irish Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304836 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 15:59:04 +00:00
Christian Moen 63f1c48b7d Added katakana stem filter (LUCENE-3901)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304719 13f79535-47bb-0310-9956-ffa450edef68
2012-03-24 06:38:53 +00:00
Michael McCandless 7291d38535 LUCENE-3905: sometimes run real-ish content (from LineFileDocs) through the analyzers too; fix end() offset bugs in the ngram tokenizers/filters
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304525 13f79535-47bb-0310-9956-ffa450edef68
2012-03-23 17:39:13 +00:00
Michael McCandless 8f21ee61cb make the connection between TPBJC's trackScores and TPBJQ's ScoreMode more transparent
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304269 13f79535-47bb-0310-9956-ffa450edef68
2012-03-23 10:33:17 +00:00
Simon Willnauer 6c73b26c93 LUCENE-3902: fix javadocs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304219 13f79535-47bb-0310-9956-ffa450edef68
2012-03-23 08:05:02 +00:00
Michael McCandless adfac074ec use singleton BytesRefIterator.EMPTY (no public class)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1304134 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 23:33:19 +00:00
Martijn van Groningen 89b55566d8 LUCENE-3778: Added a grouping utility class that makes it easier to use result grouping for pure Lucene apps.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303871 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 16:15:42 +00:00
Robert Muir 86c2da0eac happy new year
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303828 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 15:21:17 +00:00
Robert Muir c3305a50ff add some more kuromoji javadocs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303746 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 12:21:48 +00:00
Christian Moen d2eebf9330 Fix for LUCENE-3897 (KuromojiTokenizer fails with large docs)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303739 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 11:41:54 +00:00
Robert Muir a6fd306dfb add missing license headers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303738 13f79535-47bb-0310-9956-ffa450edef68
2012-03-22 11:33:45 +00:00
Michael McCandless 1a191f4edc LUCENE-3898: reset() was missing some state
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303441 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 15:22:28 +00:00
Robert Muir fb395f66a3 use MockTokenizer instead of WhitespaceTokenizer for better testing
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303382 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 13:10:38 +00:00
Michael McCandless 595744089a LUCENE-3896: CharacterUtils.fill must call Reader.read again if it only got a single high surrogate char on the first read
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303374 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 12:53:27 +00:00
Martijn van Groningen 1d642b3cd7 LUCENE-3444: Added a second pass grouping collector that keeps track of distinct values for a specified field for the top N group.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303370 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 12:41:06 +00:00
Robert Muir f75d40dad5 LUCENE-3894: try toning down for this tokenizer (it builds lots of tokens from the input treated as a path)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303276 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 04:30:11 +00:00
Robert Muir 1156de050f LUCENE-3894: add large docs tests for more tokenizers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303273 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 03:59:14 +00:00
Robert Muir dd7bfc78d9 LUCENE-3894: for tokenizers, add some tests for larger documents
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303258 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 02:54:07 +00:00
Robert Muir 3d73a3014e LUCENE-3896: beef up TestDuelingAnalyzers for larger documents
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303253 13f79535-47bb-0310-9956-ffa450edef68
2012-03-21 01:52:22 +00:00
Michael McCandless c20242721f LUCENE-3894: some tokenizers weren't reading all input chars
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303193 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 23:02:37 +00:00
Robert Muir b7a7e5a625 LUCENE-3889: remove unnecessary/unused base class
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303026 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 17:28:26 +00:00
Martijn van Groningen 7c4c592e05 LUCENE-3890: Fixed NPE for grouped faceting on multi-valued fields
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1303002 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 17:05:05 +00:00
Jan Høydahl 5648222e86 SOLR-2764: Fix testcase for minimal stemmer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302872 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 13:12:39 +00:00
Jan Høydahl 54d48eb98b SOLR-2764: Create a NorwegianLightStemmer and NorwegianMinimalStemmer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302833 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 10:57:50 +00:00
Robert Muir 790323780f basic javadocs improvements, mostly simple descriptions where the class had nothing before
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302752 13f79535-47bb-0310-9956-ffa450edef68
2012-03-20 02:09:25 +00:00
Robert Muir 4a2b1d974a javadocs: add missing package.htmls
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302713 13f79535-47bb-0310-9956-ffa450edef68
2012-03-19 23:20:25 +00:00
Shai Erera dd831066b2 remove another redundant throws FacetException from CategoryContainer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302378 13f79535-47bb-0310-9956-ffa450edef68
2012-03-19 11:29:52 +00:00
Shai Erera bfa06a9e30 remove redundant throws FacetException (as it's not really thrown)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302271 13f79535-47bb-0310-9956-ffa450edef68
2012-03-19 05:00:40 +00:00
Steven Rowe c4f72f61ac LUCENE-3880: UAX29URLEmailTokenizer now recognizes emails when the mailto: scheme is prepended.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302265 13f79535-47bb-0310-9956-ffa450edef68
2012-03-19 03:13:52 +00:00
Uwe Schindler c429736260 LUCENE-3867: Refactor RamUsageEstimator. CHANGES.txt will be added once backported to 3.x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1302133 13f79535-47bb-0310-9956-ffa450edef68
2012-03-18 14:59:10 +00:00
Robert Muir 3d2d144f92 LUCENE-3848: don't produce tokenstreams that start with posinc=0
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301478 13f79535-47bb-0310-9956-ffa450edef68
2012-03-16 13:06:30 +00:00
Martijn van Groningen 9a15b3f449 Rename docvalues to values
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301238 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 22:09:05 +00:00
Martijn van Groningen aa83c232a7 Rename docValues to values
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301236 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 22:07:50 +00:00
Martijn van Groningen 9276e43685 Rename docvalues to values
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301234 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 22:06:22 +00:00
Martijn van Groningen 2e132e006a Rename docValues to values
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301231 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 22:03:21 +00:00
Ryan McKinley 2d28a5e9a7 LUCENE-3795: return empty array for TwoDoublesStrategy
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1301206 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 21:16:04 +00:00
Martijn van Groningen 308974a36d LUCENE-3778: Fixed build
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1300932 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 11:42:32 +00:00
Martijn van Groningen 27fedb096b LUCENE-3856: Added docvalues based grouped facet collector.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1300860 13f79535-47bb-0310-9956-ffa450edef68
2012-03-15 09:31:06 +00:00