Commit Graph

97 Commits

Author SHA1 Message Date
Robert Muir 6386f77138 LUCENE-2911: synchronize grammar/token types across StandardTokenizer, UAX29EmailURLTokenizer, ICUTokenizer; add CJK types
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1068979 13f79535-47bb-0310-9956-ffa450edef68
2011-02-09 17:07:46 +00:00
Robert Muir 70a9910b38 LUCENE-2908: clean up serialization in the codebase
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1068526 13f79535-47bb-0310-9956-ffa450edef68
2011-02-08 19:05:28 +00:00
Doron Cohen 5ab6a5e7dd LUCENE-1540: Improvements to contrib.benchmark for TREC collections - bring back case insensitivity to path names using Locale.ENGLISH - port/merged from 3x r1067705.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067772 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 21:25:53 +00:00
Shai Erera ece1524805 LUCENE-2609: Generate jar containing test classes (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067738 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 19:48:54 +00:00
Doron Cohen 70cbc8acab LUCENE-1540: Improvements to contrib.benchmark for TREC collections - fix test failures in some locales due to toUpperCase() - port/merged from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067705 13f79535-47bb-0310-9956-ffa450edef68
2011-02-06 17:18:53 +00:00
Doron Cohen 8c487e588c LUCENE-1540: Improvements to contrib.benchmark for TREC collections - port/merge from 3x.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067359 13f79535-47bb-0310-9956-ffa450edef68
2011-02-05 00:35:09 +00:00
Koji Sekiguchi 6f31407109 SOLR-1057: Add PathHierarchyTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1067131 13f79535-47bb-0310-9956-ffa450edef68
2011-02-04 10:19:52 +00:00
Robert Muir dde8fc7020 LUCENE-2751: add LuceneTestCase.newSearcher. use this to get an indexsearcher that randomly uses threads, etc
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1066691 13f79535-47bb-0310-9956-ffa450edef68
2011-02-02 23:27:25 +00:00
Michael McCandless 62b692e9a3 LUCENE-2897: apply delete-by-term on flushed segment while we flush (still buffer delete-by-terms for past segments)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065855 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 23:35:02 +00:00
Michael McCandless c0b98f063a LUCENE-1591: rollback to old patched xercesImpl.jar to workaround XERCESJ-1257, which we hit on current Wikipedia XML export (enwiki-20110115-pages-articles.xml)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065719 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 19:20:34 +00:00
Robert Muir 5ccf063a5d LUCENE-2901: fix consistency of KeywordMarkerFilter, it should only set, not unset the attribute
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065621 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 14:06:45 +00:00
Robert Muir 107c06324b fix more javadocs warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065474 13f79535-47bb-0310-9956-ffa450edef68
2011-01-31 02:59:40 +00:00
Uwe Schindler e7088279f7 LUCENE-1253: LengthFilter (and Solr's KeepWordTokenFilter) now require up front specification of enablePositionIncrement. Together with StopFilter they have a common base class (FilteringTokenFilter) that handles the position increments automatically. Implementors only need to override an accept() method that filters tokens
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065343 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 18:30:34 +00:00
Michael McCandless 277dfa0e88 LUCENE-2900: allow explicit control over whether deletes must be applied when pulling NRT reader
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065337 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 18:06:37 +00:00
Robert Muir d1a5ca1460 add missing @Override and @Deprecated annotations
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065304 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 15:10:15 +00:00
Yonik Seeley 6569aa5da3 add ASL
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065302 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 15:03:01 +00:00
Robert Muir 5629a2b96b add missing license headers where there are none, but the JIRA box was checked
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1065265 13f79535-47bb-0310-9956-ffa450edef68
2011-01-30 13:28:41 +00:00
Yonik Seeley 51dc4159e6 SOLR-1283: fix numRead counter that caused mark invalid exceptions
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1063920 13f79535-47bb-0310-9956-ffa450edef68
2011-01-26 23:40:08 +00:00
Shai Erera e76ad0990d LUCENE-929: contrib/benchmark build doesn't handle checking if content is properly extracted (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1063647 13f79535-47bb-0310-9956-ffa450edef68
2011-01-26 09:10:06 +00:00
Steven Rowe 1b44e0b9a5 added support for maven artifact generation of the new Solr UIMA contrib; the top-level get-maven-poms target now forces copying of all of the source pom.xml files, even if the source is not newer than the target files, so that version changes will always take effect when specified through the -Dversion ant cmdline option
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062936 13f79535-47bb-0310-9956-ffa450edef68
2011-01-24 19:33:14 +00:00
Michael McCandless 3e8f55cd0d LUCENE-2885: add WaitForMerges tasks to benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062879 13f79535-47bb-0310-9956-ffa450edef68
2011-01-24 17:00:50 +00:00
Steven Rowe 11146b8c3c changed generate-maven-artifacts target to place all maven artifacts in one place: modules/dist/maven/; added modules/dist/ to list of dirs to remove with the 'clean' target; added modules/dist/ to svn:ignore list on modules/
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1062308 13f79535-47bb-0310-9956-ffa450edef68
2011-01-23 01:42:19 +00:00
Steven Rowe 74360c80f5 LUCENE-2657: Replace Maven POM templates with full POMs
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1061613 13f79535-47bb-0310-9956-ffa450edef68
2011-01-21 03:44:13 +00:00
Uwe Schindler 460fa90564 LUCENE-2374: Added Attribute reflection API: It's now possible to inspect the contents of AttributeImpl and AttributeSource using a well-defined API
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1061039 13f79535-47bb-0310-9956-ffa450edef68
2011-01-19 22:41:16 +00:00
Yonik Seeley b2cad88aad SOLR-2316: fail early if synonym file not provided
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1060846 13f79535-47bb-0310-9956-ffa450edef68
2011-01-19 16:11:42 +00:00
Shai Erera 2a0484bd40 LUCENE-2295: remove maxFieldLength (trunk)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1060340 13f79535-47bb-0310-9956-ffa450edef68
2011-01-18 12:01:40 +00:00
Robert Muir 4249ef9644 LUCENE-2847: remove obselete warning
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1059719 13f79535-47bb-0310-9956-ffa450edef68
2011-01-17 01:43:37 +00:00
Steven Rowe 8d7d57abdc LUCENE-2847: Added ASL2 license to supplementary macros generator, and to the generated file, and set svn:eol-style to native for both of them.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1056014 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 19:15:21 +00:00
Robert Muir fbfb07d904 LUCENE-2842: avoid java6-only String.isEmpty in rule parser
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055906 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 15:07:12 +00:00
Robert Muir 66d3f38d52 LUCENE-2842: missing eol-style
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055893 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 14:33:35 +00:00
Robert Muir 61872be09d LUCENE-2842: add Galician analyzer, Portuguese RSLP
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055892 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 14:30:37 +00:00
Steven Rowe 1b22e86417 LUCENE-2847: Support all of unicode, including supplementary code points above the basic multilingual plane, in StandardTokenizer and UAX29URLEmailTokenizer.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055877 13f79535-47bb-0310-9956-ffa450edef68
2011-01-06 13:51:10 +00:00
Michael McCandless 87274d00ac LUCENE-2837: collapse Searcher/Searchable into IndexSearcher; remove contrib/remote, MultiSearcher; absorb ParallelMultiSearcher into IndexSearcher as optional ExecutorService to ctor
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1055416 13f79535-47bb-0310-9956-ffa450edef68
2011-01-05 11:16:40 +00:00
Robert Muir f012c2d44d LUCENE-2845: move contrib/benchmark to modules/benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1054995 13f79535-47bb-0310-9956-ffa450edef68
2011-01-04 12:28:10 +00:00
Robert Muir 8696f549d4 LUCENE-2020: Remove unused imports
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1052926 13f79535-47bb-0310-9956-ffa450edef68
2010-12-26 19:16:42 +00:00
Robert Muir 620b2a0619 LUCENE-2747: Deprecate/remove language-specific tokenizers in favor of StandardTokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1043114 13f79535-47bb-0310-9956-ffa450edef68
2010-12-07 16:19:17 +00:00
Steven Rowe 2b9726ae81 LUCENE-2763: Swap URL+Email recognizing StandardTokenizer and UAX29Tokenizer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1043071 13f79535-47bb-0310-9956-ffa450edef68
2010-12-07 14:53:13 +00:00
Robert Muir f87ca310ec LUCENE-2797: Upgrade icu to 4.6
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1042185 13f79535-47bb-0310-9956-ffa450edef68
2010-12-04 14:08:03 +00:00
Robert Muir a58c26978f LUCENE-2781: drop deprecations from trunk
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1040463 13f79535-47bb-0310-9956-ffa450edef68
2010-11-30 11:22:39 +00:00
Robert Muir ff47493dbd fix bug where StandardFilter isn't respected
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1040390 13f79535-47bb-0310-9956-ffa450edef68
2010-11-30 02:44:47 +00:00
Robert Muir de3d057abc SOLR-2237: Added StempelPolishStemFilterFactory to contrib/analysis-extras
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1035996 13f79535-47bb-0310-9956-ffa450edef68
2010-11-17 12:26:15 +00:00
Uwe Schindler 6f230c5e08 revert changes (will come in 3.x)
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029347 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 14:03:50 +00:00
Uwe Schindler 819344aeab LUCENE-2732: Fix charset problems in XML loading in HyphenationCompoundWordTokenFilter
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1029345 13f79535-47bb-0310-9956-ffa450edef68
2010-10-31 13:56:46 +00:00
Uwe Schindler 987f32849b LUCENE-2708: when a test Assume fails, display information, improved one
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1023312 13f79535-47bb-0310-9956-ffa450edef68
2010-10-16 15:43:11 +00:00
Steven Rowe 7f6dd505f1 LUCENE-2699: Update StandardTokenizer and UAX29Tokenizer to Unicode 6.0.0
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1022826 13f79535-47bb-0310-9956-ffa450edef68
2010-10-15 05:41:54 +00:00
Steven Rowe f9e4f551e2 LUCENE-1370: Added ShingleFilter option to output unigrams if no shingles can be generated.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1006187 13f79535-47bb-0310-9956-ffa450edef68
2010-10-09 16:55:23 +00:00
Robert Muir 6c361ace76 add javadocs target
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1005492 13f79535-47bb-0310-9956-ffa450edef68
2010-10-07 15:23:54 +00:00
Robert Muir 0f1f892316 add compile-test
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1005487 13f79535-47bb-0310-9956-ffa450edef68
2010-10-07 15:13:43 +00:00
Steven Rowe 42d5b585ce Ignore this test under IntelliJ, which can't use Ant's test file patterns (Test*.java,*Test.java) to ignore this test, and thinks it's a failure since no test methods can be found.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1004853 13f79535-47bb-0310-9956-ffa450edef68
2010-10-05 23:23:55 +00:00
Robert Muir 7c020e317a LUCENE-2683: upgrade icu libraries to 4.4.2
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1004335 13f79535-47bb-0310-9956-ffa450edef68
2010-10-04 17:53:41 +00:00