404 Commits

Author SHA1 Message Date
Robert Muir a871b29ed6 LUCENE-3086: add ElisionFilter to ItalianAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1102120 13f79535-47bb-0310-9956-ffa450edef68
2011-05-11 22:43:54 +00:00
Ryan McKinley 96878534a0 LUCENE-3071: Add ReversePathHierarchyTokenizer and enable skip on PathHierarchyTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1099999 13f79535-47bb-0310-9956-ffa450edef68
2011-05-05 23:30:05 +00:00
Robert Muir 4455345c6e LUCENE-3063: factor CharTokenizer/CharacterUtils into analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1098871 13f79535-47bb-0310-9956-ffa450edef68
2011-05-03 00:29:47 +00:00
Robert Muir a75e5282c7 collation tests: try to find less jre bugs and just test thread safety
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1098532 13f79535-47bb-0310-9956-ffa450edef68
2011-05-02 12:03:14 +00:00
Robert Muir 1f67321074 missing svn:eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1097216 13f79535-47bb-0310-9956-ffa450edef68
2011-04-27 19:40:18 +00:00
Robert Muir 44ba0859db LUCENE-2560: stress tests for icu integration
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1096339 13f79535-47bb-0310-9956-ffa450edef68
2011-04-24 16:07:16 +00:00
Robert Muir 593d7a54ea LUCENE-3044: ThaiWordFilter uses AttributeSource.copyTo incorrectly
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1096334 13f79535-47bb-0310-9956-ffa450edef68
2011-04-24 15:45:45 +00:00
Robert Muir 7db98455e7 LUCENE-3043: GermanStemmer threw IOOBE on zero-length tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1096194 13f79535-47bb-0310-9956-ffa450edef68
2011-04-23 17:48:17 +00:00
Robert Muir c0c695053c LUCENE-2560: remove copy/paste unused import
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1096183 13f79535-47bb-0310-9956-ffa450edef68
2011-04-23 17:16:51 +00:00
Robert Muir 68061ef921 LUCENE-2560: add basic stress tests for analyzers
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1096178 13f79535-47bb-0310-9956-ffa450edef68
2011-04-23 16:55:15 +00:00
Robert Muir c3f6331639 LUCENE-3016: add analyzer for Latvian
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1092396 13f79535-47bb-0310-9956-ffa450edef68
2011-04-14 17:07:10 +00:00
Robert Muir ecd795c585 LUCENE-3026: SmartChineseAnalyzer's WordTokenFilter threw NullPointerException on sentences longer than 32,767 characters
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1092328 13f79535-47bb-0310-9956-ffa450edef68
2011-04-14 15:15:31 +00:00
Robert Muir 52b54262dc LUCENE-3020: don't reflect mockanalyzer, it has no no-arg ctor anymore
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1091159 13f79535-47bb-0310-9956-ffa450edef68
2011-04-11 18:15:50 +00:00
Robert Muir 7d07d206b5 LUCENE-3020: better payload testing with mockanalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1091132 13f79535-47bb-0310-9956-ffa450edef68
2011-04-11 17:20:31 +00:00
Steven Rowe c613d642a0 LUCENE-3006: specialized definition of javadoc.classpath is not required for building ICU analysis module's javadocs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1088063 13f79535-47bb-0310-9956-ffa450edef68
2011-04-02 16:47:24 +00:00
Michael McCandless f10d92398b LUCENE-1076: new TieredMergePolicy
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1088051 13f79535-47bb-0310-9956-ffa450edef68
2011-04-02 15:47:12 +00:00
Steven Rowe 14eb02ffa4 LUCENE-3006: die javadoc warnings die (modules/ edition)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087830 13f79535-47bb-0310-9956-ffa450edef68
2011-04-01 17:43:24 +00:00
Robert Muir d940c24c03 fix benchmark collation test to match reality
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087548 13f79535-47bb-0310-9956-ffa450edef68
2011-04-01 01:58:35 +00:00
Robert Muir 74a065a57f fix collation benchmark to use byte terms
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087529 13f79535-47bb-0310-9956-ffa450edef68
2011-04-01 00:47:16 +00:00
Steven Rowe 7402c50058 fix typo
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087475 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 22:53:58 +00:00
Steven Rowe 085d30ecf3 changes entries for recent commits
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087474 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 22:53:21 +00:00
Steven Rowe 1caaea77b2 ReadTokensTask now converts tokens to their indexed forms (char[]->byte[]), just as the indexer does. This allows measurement of the conversion process, which is important for analysis components that customize it, e.g. (ICU)CollationKeyFilter.
NB: as a result, benchmarks that incorporate this task will no longer be directly comparable between 3.X and 4.0

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087471 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 22:44:20 +00:00
Steven Rowe 9cefe60a4b Removed special case for looking up KeywordAnalyzer, which is *not* alone among analyzers occupying package o.a.l.analysis.core.
Instead, now attempting to instantiate no-package analyzers as core analyzers, then falling back to the previous default package ("org.apache.lucene.analysis.") if that fails.  Also, made the same changes in NewShingleAnalyzerTask.

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087468 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 22:34:46 +00:00
Steven Rowe 3bbfa450e4 Updated to the new method for obtaining a top-level deleted docs bitset. Also checking the bitset for null, when there are no deleted docs.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087435 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 21:03:18 +00:00
Steven Rowe 56c2994f66 Added a special case for looking up KeywordAnalyzer, which alone among analyzers occupies package o.a.l.analysis.core.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1087431 13f79535-47bb-0310-9956-ffa450edef68
2011-03-31 20:16:10 +00:00
Doron Cohen 8d0c1b62af LUCENE-2977: WriteLineDocTask should write gzip/bzip2/txt according to the extension of specified output file name.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084929 13f79535-47bb-0310-9956-ffa450edef68
2011-03-24 12:22:13 +00:00
Doron Cohen c6f3dd5cc7 LUCENE-2980: Benchmark's ContentSource made insensitive to letter case of file suffix - fix CHANGES entry.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084549 13f79535-47bb-0310-9956-ffa450edef68
2011-03-23 11:47:18 +00:00
Doron Cohen d123b8a224 LUCENE-2980: Benchmark's ContentSource made insensitive to letter case of file suffix.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084544 13f79535-47bb-0310-9956-ffa450edef68
2011-03-23 11:38:54 +00:00
Grant Ingersoll ed20a24d22 LUCENE-2952: restore src/tools and move validation there
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084274 13f79535-47bb-0310-9956-ffa450edef68
2011-03-22 18:03:57 +00:00
Doron Cohen 97909a908e fix mis-spelled assert comment (again)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084273 13f79535-47bb-0310-9956-ffa450edef68
2011-03-22 18:03:00 +00:00
Doron Cohen 1029aedcfd fix mis-spelled assert comment.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084247 13f79535-47bb-0310-9956-ffa450edef68
2011-03-22 16:46:53 +00:00
Doron Cohen bb8e6ae846 LUCENE-2978: Upgrade benchmark's commons-compress from 1.0 to 1.1.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1084210 13f79535-47bb-0310-9956-ffa450edef68
2011-03-22 15:08:29 +00:00
Doron Cohen a9fda446c3 LUCENE-2958: WriteLineDocTask improvements - flexible line fields definition - port/merge from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1083816 13f79535-47bb-0310-9956-ffa450edef68
2011-03-21 14:59:42 +00:00
Robert Muir e67bf6b089 LUCENE-2944: fix BytesRef reuse bugs, TermToBytesRefAttribute owns the bytes like other attributes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1083784 13f79535-47bb-0310-9956-ffa450edef68
2011-03-21 13:52:15 +00:00
Doron Cohen e45d28a8d3 LUCENE-2964: Allow benchmark tasks from alternative packages - merge/port from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1083731 13f79535-47bb-0310-9956-ffa450edef68
2011-03-21 11:23:37 +00:00
Doron Cohen 6d47d7377d LUCENE-2963: Easier way to run benchmark, by calling Benchmark.exec(alg-file) - port from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1083557 13f79535-47bb-0310-9956-ffa450edef68
2011-03-20 20:12:39 +00:00
Grant Ingersoll 8bee953057 LUCENE-2952: drop dev-tools dependency, move to test framework, split out checking to each area: lucene, modules, solr
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1083010 13f79535-47bb-0310-9956-ffa450edef68
2011-03-18 18:40:02 +00:00
Grant Ingersoll f36c32405d LUCENE-2952: hook in dependency checking for license, notice
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1082535 13f79535-47bb-0310-9956-ffa450edef68
2011-03-17 15:34:21 +00:00
Grant Ingersoll 746d0ef5a0 LUCENE-2952: add notices
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1082516 13f79535-47bb-0310-9956-ffa450edef68
2011-03-17 15:00:51 +00:00
Grant Ingersoll 372fa574f9 remove unneeded license, notice
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1082512 13f79535-47bb-0310-9956-ffa450edef68
2011-03-17 14:55:34 +00:00
Grant Ingersoll 9352885d1b LUCENE-2952: normalize license files
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1081792 13f79535-47bb-0310-9956-ffa450edef68
2011-03-15 14:07:16 +00:00
Steven Rowe 7180bb3cb9 LUCENE-2957: generate-maven-artifacts target should include all non-Mavenized Lucene & Solr dependencies
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1080443 13f79535-47bb-0310-9956-ffa450edef68
2011-03-11 04:32:14 +00:00
Steven Rowe 77371e0433 Obsolete - replaced by apache-extras luceneutil
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1080335 13f79535-47bb-0310-9956-ffa450edef68
2011-03-10 20:23:02 +00:00
Steven Rowe 3fcf6d6525 LUCENE-2961: Remove benchmark/lib/xml-apis-2.9.0.jar - JVM 1.5+ contains these JAXP 1.3 interface classes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1080258 13f79535-47bb-0310-9956-ffa450edef68
2011-03-10 15:57:52 +00:00
Uwe Schindler bdaa02c3c0 LUCENE-2953: PriorityQueue's internal heap was made private final
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1079707 13f79535-47bb-0310-9956-ffa450edef68
2011-03-09 09:18:56 +00:00
Robert Muir 52fbd34849 clear java 1.5-only javadocs warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1078639 13f79535-47bb-0310-9956-ffa450edef68
2011-03-07 00:55:32 +00:00
Robert Muir b2fcee9822 add missing LICENSE/NOTICE to benchmarks module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1078565 13f79535-47bb-0310-9956-ffa450edef68
2011-03-06 20:48:10 +00:00
Robert Muir 48dbe35e69 correct minor problems with dates and copyright owners in NOTICE.txts
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1078529 13f79535-47bb-0310-9956-ffa450edef68
2011-03-06 18:22:48 +00:00
Robert Muir 28ea4b7561 add xyz-LICENSE.txt for all third party jars
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1078512 13f79535-47bb-0310-9956-ffa450edef68
2011-03-06 16:50:22 +00:00
Robert Muir d51068ffd6 LUCENE-2894: apply formatting to more code samples
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1076237 13f79535-47bb-0310-9956-ffa450edef68
2011-03-02 14:59:02 +00:00
Robert Muir 6600f5acdf LUCENE-2943: tone down test even more
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1076223 13f79535-47bb-0310-9956-ffa450edef68
2011-03-02 13:56:15 +00:00
Robert Muir 7e5d696d7d LUCENE-2943: tone down test with multiplier a bit
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1075999 13f79535-47bb-0310-9956-ffa450edef68
2011-03-01 19:53:05 +00:00
Robert Muir 2509d35c11 LUCENE-2943: fix thread-safety issues with ICU collation
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1075850 13f79535-47bb-0310-9956-ffa450edef68
2011-03-01 15:47:14 +00:00
Robert Muir 308e0bd4a9 LUCENE-2514, LUCENE-2551: collation uses byte[] keys, deprecate old unscalable locale sort/range, termrangequery/filter work on bytes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1075210 13f79535-47bb-0310-9956-ffa450edef68
2011-02-28 05:15:50 +00:00
Steven Rowe 88caf3a6f6 LUCENE-2923: Restore benchmark jar production for generate-maven-artifacts target
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1072393 13f79535-47bb-0310-9956-ffa450edef68
2011-02-19 17:06:01 +00:00
Steven Rowe f7b037d3cf LUCENE-2923: Cleanup contrib/demo
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1072250 13f79535-47bb-0310-9956-ffa450edef68
2011-02-19 04:49:36 +00:00
Uwe Schindler 5691bea096 LUCENE-2920: Removed ShingleMatrixFilter as it is unmaintained and does not work with custom Attributes or custom payload encoders
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1070821 13f79535-47bb-0310-9956-ffa450edef68
2011-02-15 09:24:06 +00:00
Robert Muir 6386f77138 LUCENE-2911: synchronize grammar/token types across StandardTokenizer, UAX29EmailURLTokenizer, ICUTokenizer; add CJK types
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1068979 13f79535-47bb-0310-9956-ffa450edef68
2011-02-09 17:07:46 +00:00
Robert Muir 70a9910b38 LUCENE-2908: clean up serialization in the codebase
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1068526 13f79535-47bb-0310-9956-ffa450edef68
2011-02-08 19:05:28 +00:00
Doron Cohen 5ab6a5e7dd LUCENE-1540: Improvements to contrib.benchmark for TREC collections - bring back case insensitivity to path names using Locale.ENGLISH - port/merged from 3x r1067705.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067772 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 21:25:53 +00:00
Shai Erera ece1524805 LUCENE-2609: Generate jar containing test classes (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067738 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 19:48:54 +00:00
Doron Cohen 70cbc8acab LUCENE-1540: Improvements to contrib.benchmark for TREC collections - fix test failures in some locales due to toUpperCase() - port/merged from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067705 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 17:18:53 +00:00
Doron Cohen 8c487e588c LUCENE-1540: Improvements to contrib.benchmark for TREC collections - port/merge from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067359 13f79535-47bb-0310-9956-ffa450edef68
2011-02-05 00:35:09 +00:00
Koji Sekiguchi 6f31407109 SOLR-1057: Add PathHierarchyTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067131 13f79535-47bb-0310-9956-ffa450edef68
2011-02-04 10:19:52 +00:00
Robert Muir dde8fc7020 LUCENE-2751: add LuceneTestCase.newSearcher. use this to get an indexsearcher that randomly uses threads, etc
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1066691 13f79535-47bb-0310-9956-ffa450edef68
2011-02-02 23:27:25 +00:00
Michael McCandless 62b692e9a3 LUCENE-2897: apply delete-by-term on flushed segment while we flush (still buffer delete-by-terms for past segments)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065855 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 23:35:02 +00:00
Michael McCandless c0b98f063a LUCENE-1591: rollback to old patched xercesImpl.jar to workaround XERCESJ-1257, which we hit on current Wikipedia XML export (enwiki-20110115-pages-articles.xml)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065719 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 19:20:34 +00:00
Robert Muir 5ccf063a5d LUCENE-2901: fix consistency of KeywordMarkerFilter, it should only set, not unset the attribute
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065621 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 14:06:45 +00:00
Robert Muir 107c06324b fix more javadocs warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065474 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 02:59:40 +00:00
Uwe Schindler e7088279f7 LUCENE-1253: LengthFilter (and Solr's KeepWordTokenFilter) now require up front specification of enablePositionIncrement. Together with StopFilter they have a common base class (FilteringTokenFilter) that handles the position increments automatically. Implementors only need to override an accept() method that filters tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065343 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 18:30:34 +00:00
Michael McCandless 277dfa0e88 LUCENE-2900: allow explicit control over whether deletes must be applied when pulling NRT reader
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065337 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 18:06:37 +00:00
Robert Muir d1a5ca1460 add missing @Override and @Deprecated annotations
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065304 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 15:10:15 +00:00
Yonik Seeley 6569aa5da3 add ASL
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065302 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 15:03:01 +00:00
Robert Muir 5629a2b96b add missing license headers where there are none, but the JIRA box was checked
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065265 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 13:28:41 +00:00
Yonik Seeley 51dc4159e6 SOLR-1283: fix numRead counter that caused mark invalid exceptions
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1063920 13f79535-47bb-0310-9956-ffa450edef68
2011-01-26 23:40:08 +00:00
Shai Erera e76ad0990d LUCENE-929: contrib/benchmark build doesn't handle checking if content is properly extracted (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1063647 13f79535-47bb-0310-9956-ffa450edef68
2011-01-26 09:10:06 +00:00
Steven Rowe 1b44e0b9a5 added support for maven artifact generation of the new Solr UIMA contrib; the top-level get-maven-poms target now forces copying of all of the source pom.xml files, even if the source is not newer than the target files, so that version changes will always take effect when specified through the -Dversion ant cmdline option
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062936 13f79535-47bb-0310-9956-ffa450edef68
2011-01-24 19:33:14 +00:00
Michael McCandless 3e8f55cd0d LUCENE-2885: add WaitForMerges tasks to benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062879 13f79535-47bb-0310-9956-ffa450edef68
2011-01-24 17:00:50 +00:00
Steven Rowe 11146b8c3c changed generate-maven-artifacts target to place all maven artifacts in one place: modules/dist/maven/; added modules/dist/ to list of dirs to remove with the 'clean' target; added modules/dist/ to svn:ignore list on modules/
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062308 13f79535-47bb-0310-9956-ffa450edef68
2011-01-23 01:42:19 +00:00
Steven Rowe 74360c80f5 LUCENE-2657: Replace Maven POM templates with full POMs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1061613 13f79535-47bb-0310-9956-ffa450edef68
2011-01-21 03:44:13 +00:00
Uwe Schindler 460fa90564 LUCENE-2374: Added Attribute reflection API: It's now possible to inspect the contents of AttributeImpl and AttributeSource using a well-defined API
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1061039 13f79535-47bb-0310-9956-ffa450edef68
2011-01-19 22:41:16 +00:00
Yonik Seeley b2cad88aad SOLR-2316: fail early if synonym file not provided
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1060846 13f79535-47bb-0310-9956-ffa450edef68
2011-01-19 16:11:42 +00:00
Shai Erera 2a0484bd40 LUCENE-2295: remove maxFieldLength (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1060340 13f79535-47bb-0310-9956-ffa450edef68
2011-01-18 12:01:40 +00:00
Robert Muir 4249ef9644 LUCENE-2847: remove obsolete warning
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1059719 13f79535-47bb-0310-9956-ffa450edef68
2011-01-17 01:43:37 +00:00
Steven Rowe 8d7d57abdc LUCENE-2847: Added ASL2 license to supplementary macros generator, and to the generated file, and set svn:eol-style to native for both of them.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1056014 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 19:15:21 +00:00
Robert Muir fbfb07d904 LUCENE-2842: avoid java6-only String.isEmpty in rule parser
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055906 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 15:07:12 +00:00
Robert Muir 66d3f38d52 LUCENE-2842: missing eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055893 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 14:33:35 +00:00
Robert Muir 61872be09d LUCENE-2842: add Galician analyzer, Portuguese RSLP
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055892 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 14:30:37 +00:00
Steven Rowe 1b22e86417 LUCENE-2847: Support all of unicode, including supplementary code points above the basic multilingual plane, in StandardTokenizer and UAX29URLEmailTokenizer.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055877 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 13:51:10 +00:00
Michael McCandless 87274d00ac LUCENE-2837: collapse Searcher/Searchable into IndexSearcher; remove contrib/remote, MultiSearcher; absorb ParallelMultiSearcher into IndexSearcher as optional ExecutorService to ctor
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055416 13f79535-47bb-0310-9956-ffa450edef68
2011-01-05 11:16:40 +00:00
Robert Muir f012c2d44d LUCENE-2845: move contrib/benchmark to modules/benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1054995 13f79535-47bb-0310-9956-ffa450edef68
2011-01-04 12:28:10 +00:00
Robert Muir 8696f549d4 LUCENE-2020: Remove unused imports
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1052926 13f79535-47bb-0310-9956-ffa450edef68
2010-12-26 19:16:42 +00:00
Robert Muir 620b2a0619 LUCENE-2747: Deprecate/remove language-specific tokenizers in favor of StandardTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1043114 13f79535-47bb-0310-9956-ffa450edef68
2010-12-07 16:19:17 +00:00
Steven Rowe 2b9726ae81 LUCENE-2763: Swap URL+Email recognizing StandardTokenizer and UAX29Tokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1043071 13f79535-47bb-0310-9956-ffa450edef68
2010-12-07 14:53:13 +00:00
Robert Muir f87ca310ec LUCENE-2797: Upgrade icu to 4.6
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1042185 13f79535-47bb-0310-9956-ffa450edef68
2010-12-04 14:08:03 +00:00
Robert Muir a58c26978f LUCENE-2781: drop deprecations from trunk
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1040463 13f79535-47bb-0310-9956-ffa450edef68
2010-11-30 11:22:39 +00:00
Robert Muir ff47493dbd fix bug where StandardFilter isn't respected
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1040390 13f79535-47bb-0310-9956-ffa450edef68
2010-11-30 02:44:47 +00:00
Robert Muir de3d057abc SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1035996 13f79535-47bb-0310-9956-ffa450edef68
2010-11-17 12:26:15 +00:00
Uwe Schindler 6f230c5e08 revert changes (will come in 3.x)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029347 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 14:03:50 +00:00
Uwe Schindler 819344aeab LUCENE-2732: Fix charset problems in XML loading in HyphenationCompoundWordTokenFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029345 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 13:56:46 +00:00
Uwe Schindler 987f32849b LUCENE-2708: when a test Assume fails, display information, improved one
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1023312 13f79535-47bb-0310-9956-ffa450edef68
2010-10-16 15:43:11 +00:00
Steven Rowe 7f6dd505f1 LUCENE-2699: Update StandardTokenizer and UAX29Tokenizer to Unicode 6.0.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1022826 13f79535-47bb-0310-9956-ffa450edef68
2010-10-15 05:41:54 +00:00
Steven Rowe f9e4f551e2 LUCENE-1370: Added ShingleFilter option to output unigrams if no shingles can be generated.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1006187 13f79535-47bb-0310-9956-ffa450edef68
2010-10-09 16:55:23 +00:00
Robert Muir 6c361ace76 add javadocs target
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1005492 13f79535-47bb-0310-9956-ffa450edef68
2010-10-07 15:23:54 +00:00
Robert Muir 0f1f892316 add compile-test
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1005487 13f79535-47bb-0310-9956-ffa450edef68
2010-10-07 15:13:43 +00:00
Steven Rowe 42d5b585ce Ignore this test under IntelliJ, which can't use Ant's test file patterns (Test*.java,*Test.java) to ignore this test, and thinks it's a failure since no test methods can be found.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1004853 13f79535-47bb-0310-9956-ffa450edef68
2010-10-05 23:23:55 +00:00
Robert Muir 7c020e317a LUCENE-2683: upgrade icu libraries to 4.4.2
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1004335 13f79535-47bb-0310-9956-ffa450edef68
2010-10-04 17:53:41 +00:00
Robert Muir 98621382be clear up 1.5-only javadocs warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1004215 13f79535-47bb-0310-9956-ffa450edef68
2010-10-04 12:03:51 +00:00
Robert Muir afad8123d2 clear up more warnings in modules/contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003990 13f79535-47bb-0310-9956-ffa450edef68
2010-10-03 16:27:34 +00:00
Robert Muir 0789e5f4e7 LUCENE-2681: fix generics violations in contrib/modules
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003978 13f79535-47bb-0310-9956-ffa450edef68
2010-10-03 15:41:57 +00:00
Robert Muir 85a27b8b38 clear up javadocs warnings/errors (forgot to svn add these overview.htmls)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003965 13f79535-47bb-0310-9956-ffa450edef68
2010-10-03 13:30:29 +00:00
Robert Muir e05117884a clear up javadocs warnings/errors
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003962 13f79535-47bb-0310-9956-ffa450edef68
2010-10-03 13:22:51 +00:00
Robert Muir c8b7a21b4b clear up more compiler warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003906 13f79535-47bb-0310-9956-ffa450edef68
2010-10-02 22:20:26 +00:00
Robert Muir fd11477ece clean up some fallthru/deprecation warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1003873 13f79535-47bb-0310-9956-ffa450edef68
2010-10-02 19:58:35 +00:00
Robert Muir f5031a6b27 LUCENE-2167: cut over these analyzers also
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1002221 13f79535-47bb-0310-9956-ffa450edef68
2010-09-28 15:33:22 +00:00
Steven Rowe 3c26a9167c LUCENE-2167: Implement StandardTokenizer with the UAX#29 Standard
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1002032 13f79535-47bb-0310-9956-ffa450edef68
2010-09-28 06:16:16 +00:00
Robert Muir cce20cd820 LUCENE-2070: document how LengthFilter counts characters
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1000675 13f79535-47bb-0310-9956-ffa450edef68
2010-09-24 00:42:05 +00:00
Robert Muir c84bd2f1ec LUCENE-2653: ThaiAnalyzer assumes things about your jre
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@998684 13f79535-47bb-0310-9956-ffa450edef68
2010-09-19 15:40:06 +00:00
Robert Muir 774eaeada0 LUCENE-2630: fix intl test bugs that rely on cldr version
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@997180 13f79535-47bb-0310-9956-ffa450edef68
2010-09-15 03:30:35 +00:00
Robert Muir feabadea20 LUCENE-2642: merge LuceneTestCase and LuceneTestCaseJ4
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@996611 13f79535-47bb-0310-9956-ffa450edef68
2010-09-13 17:37:20 +00:00
Robert Muir d38ec19a28 LUCENE-2639: remove random juggling in tests, add -Dtests.seed
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@995772 13f79535-47bb-0310-9956-ffa450edef68
2010-09-10 12:34:18 +00:00
Robert Muir 912a6152a8 LUCENE-2629: fix analysis/icu's gennorm2 task
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@991053 13f79535-47bb-0310-9956-ffa450edef68
2010-08-31 01:33:02 +00:00
Robert Muir 13fd70521a LUCENE-2624: add armenian, basque, catalan analyzers from snowball
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@990459 13f79535-47bb-0310-9956-ffa450edef68
2010-08-28 22:42:25 +00:00
Robert Muir 33cc5a041e SOLR-2059: Add types attribute to WordDelimiterFilterFactory
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@990451 13f79535-47bb-0310-9956-ffa450edef68
2010-08-28 21:25:44 +00:00
Robert Muir 48dde8359f LUCENE-2098: speed up BaseCharFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@990161 13f79535-47bb-0310-9956-ffa450edef68
2010-08-27 14:33:22 +00:00
Robert Muir 07df8d5210 LUCENE-2598: factor the behavior of MockRAMDirectory into MockDirectoryWrapper, add experimental -Dtests.directory= to allow running the tests under different directory impls [but the default is still RAMDirectory]
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@988206 13f79535-47bb-0310-9956-ffa450edef68
2010-08-23 17:00:43 +00:00
Ryan McKinley c31c4b63d1 even more pom fixes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@987378 13f79535-47bb-0310-9956-ffa450edef68
2010-08-20 04:11:50 +00:00
Ryan McKinley 3be9fedd84 getting 'generate-maven-artifacts' to work with analysis module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@987286 13f79535-47bb-0310-9956-ffa450edef68
2010-08-19 19:58:36 +00:00
Robert Muir 1473b59c0e SOLR-1860: expose these analyzers stoplists as .txt like the others
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@986612 13f79535-47bb-0310-9956-ffa450edef68
2010-08-18 09:59:00 +00:00
Robert Muir faed4b4cd0 LUCENE-2598: add newDirectory and track that resources are closed correctly by tests
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@984968 13f79535-47bb-0310-9956-ffa450edef68
2010-08-12 20:56:23 +00:00
Robert Muir 61954ca249 SOLR-2002: change tests from TestCase to LuceneTestCase for better coverage
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@983530 13f79535-47bb-0310-9956-ffa450edef68
2010-08-09 06:11:16 +00:00
Shai Erera bed729c561 LUCENE-2570: Some improvements to _TestUtil and its usage
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@979646 13f79535-47bb-0310-9956-ffa450edef68
2010-07-27 11:31:25 +00:00
Robert Muir fcc9a4a3c3 LUCENE-2503: add forgotten javadoc/citation (sorry)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@964054 13f79535-47bb-0310-9956-ffa450edef68
2010-07-14 14:06:06 +00:00
Robert Muir 3241eb9291 LUCENE-2503: add light stemmers for european languages
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@964019 13f79535-47bb-0310-9956-ffa450edef68
2010-07-14 12:10:34 +00:00
Robert Muir 8f71031ac8 LUCENE-2413: consolidate remaining solr tokenstreams into modules/analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@957162 13f79535-47bb-0310-9956-ffa450edef68
2010-06-23 11:25:17 +00:00
Michael McCandless c91bddb26b LUCENE-2380: hard cutover of all preflex APIs to flex
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@955257 13f79535-47bb-0310-9956-ffa450edef68
2010-06-16 15:17:32 +00:00
Robert Muir 5a661500c1 LUCENE-2413: directory and package fixes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@955203 13f79535-47bb-0310-9956-ffa450edef68
2010-06-16 11:33:29 +00:00
Robert Muir 6e51a53189 LUCENE-2372: remove unused import
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@950042 13f79535-47bb-0310-9956-ffa450edef68
2010-06-01 12:42:30 +00:00
Robert Muir ad0e495911 LUCENE-2372: switch over remaining uses of TermAttribute
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@950008 13f79535-47bb-0310-9956-ffa450edef68
2010-06-01 10:35:13 +00:00
Uwe Schindler 98b252ed7f LUCENE-2295: Added a LimitTokenCountAnalyzer / LimitTokenCountFilter to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter. This patch also fixes a bug in the offset calculation in CharTokenizer.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@949445 13f79535-47bb-0310-9956-ffa450edef68
2010-05-29 23:14:18 +00:00
Uwe Schindler 9e61dd591f Generics Policeman ticket
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@948234 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 22:44:36 +00:00
Robert Muir a0c72afb31 LUCENE-2413: move more core analysis to analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@948225 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 22:28:32 +00:00
Robert Muir 71b59ca566 LUCENE-2413: consolidate remaining concrete core analyzers to modules/analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@948195 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 20:16:44 +00:00
Robert Muir 5259d7d90b LUCENE-2413: move KeywordMarkerFilter to analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@946621 13f79535-47bb-0310-9956-ffa450edef68
2010-05-20 13:23:12 +00:00
Robert Muir 5ccb3ae286 LUCENE-2413: fold contrib/icu into analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@946590 13f79535-47bb-0310-9956-ffa450edef68
2010-05-20 10:46:00 +00:00
Robert Muir fe5f1aabcb LUCENE-1287: Allow usage of HyphenationCompoundWordTokenFilter without a dictionary
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@946139 13f79535-47bb-0310-9956-ffa450edef68
2010-05-19 11:58:37 +00:00
Uwe Schindler cd45643b96 LUCENE-2384: Remove hack, as JFlex trunk now has the zzBuffer bug fixed
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@945130 13f79535-47bb-0310-9956-ffa450edef68
2010-05-17 13:13:10 +00:00
Robert Muir acbf053b7c LUCENE-2463: Improve Greek analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@945090 13f79535-47bb-0310-9956-ffa450edef68
2010-05-17 11:28:04 +00:00
Robert Muir 26b9faddb2 LUCENE-2413: consolidate SynonymFilter into analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@942827 13f79535-47bb-0310-9956-ffa450edef68
2010-05-10 17:37:45 +00:00
Robert Muir 399d373089 fix the compile target... this never worked for contrib/analyzers before either
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@942262 13f79535-47bb-0310-9956-ffa450edef68
2010-05-07 22:51:45 +00:00
Robert Muir 1b020be130 LUCENE-2437: Indonesian Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@942235 13f79535-47bb-0310-9956-ffa450edef68
2010-05-07 21:21:12 +00:00
Robert Muir 1e1296e6f8 sync all changes to reflect reality
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@941710 13f79535-47bb-0310-9956-ffa450edef68
2010-05-06 13:08:59 +00:00
Robert Muir bef21b3e18 LUCENE-2444: boilerplate stuff for the analyzers module
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@941369 13f79535-47bb-0310-9956-ffa450edef68
2010-05-05 16:27:58 +00:00
Robert Muir f6e9cc9f32 LUCENE-2444: move contrib/analyzers to modules/analysis
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@941308 13f79535-47bb-0310-9956-ffa450edef68
2010-05-05 14:26:59 +00:00