659 Commits

Author SHA1 Message Date
Michael Busch 537aeb24e0 LUCENE-1759: Set final offset correctly in contrib TokenStreams.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799968 13f79535-47bb-0310-9956-ffa450edef68
2009-08-02 02:10:46 +00:00
Michael Busch 1743081b07 LUCENE-1460: Changed TokenStreams/TokenFilters in contrib to use the new TokenStream API.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799953 13f79535-47bb-0310-9956-ffa450edef68
2009-08-01 22:52:32 +00:00
Mark Robert Miller 3e869d9336 remove system.out and unnecessary next() in tokenstream
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799865 13f79535-47bb-0310-9956-ffa450edef68
2009-08-01 14:18:19 +00:00
Michael McCandless 175e8b546d LUCENE-1763: require IndexWriter be passed up front to the MergePolicy
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799818 13f79535-47bb-0310-9956-ffa450edef68
2009-08-01 09:22:25 +00:00
Michael McCandless bbcab117d9 LUCENE-1683: fixed JavaUtilRegexCapabilities (used by RegexQuery) to match entire string not just prefix
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799678 13f79535-47bb-0310-9956-ffa450edef68
2009-07-31 18:02:56 +00:00
Michael McCandless 0b0d13dffe LUCENE-1745: allow passing matching flags to the underlying regexp engine
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799667 13f79535-47bb-0310-9956-ffa450edef68
2009-07-31 17:41:04 +00:00
Uwe Schindler f8b2f0122c Use the empty docidset provided by DocIdSet.EMPTY_DOCIDSET
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799582 13f79535-47bb-0310-9956-ffa450edef68
2009-07-31 11:32:37 +00:00
Mark Robert Miller f73a4f4324 LUCENE-1695: Update the Highlighter to use the new TokenStream API. This issue breaks backwards compatibility with some public classes. If you have implemented custom Fragmenters or Scorers, you will need to adjust them to work with the new TokenStream API. Rather than getting passed a Token at a time, you will be given a TokenStream to init your impl with - store the Attributes you are interested in locally and access them on each call to the method that used to pass a new Token. Look at the included updated impls for examples.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799455 13f79535-47bb-0310-9956-ffa450edef68
2009-07-30 22:00:47 +00:00
Mark Robert Miller 7ecaa8c990 wikipedia-flush-by-RAM.alg should use content.source
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799379 13f79535-47bb-0310-9956-ffa450edef68
2009-07-30 17:35:10 +00:00
Mark Robert Miller e505413fae wikipedia.alg should use content.source
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799377 13f79535-47bb-0310-9956-ffa450edef68
2009-07-30 17:34:28 +00:00
Mark Robert Miller afb517e832 LUCENE-1752: Missing highlights when terms were repeated in separate, nested, boolean or disjunction queries.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@798976 13f79535-47bb-0310-9956-ffa450edef68
2009-07-29 16:47:05 +00:00
Michael McCandless dbff1fc9b5 LUCENE-1754: just use EMPTY_DOCIDSET.iterator() instead of new EmptyDocIdSetIterator
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@798102 13f79535-47bb-0310-9956-ffa450edef68
2009-07-27 11:12:36 +00:00
Michael McCandless 094c674c4d LUCENE-1595: don't use SortField.AUTO; deprecate LineDocMaker & EnwikiDocMaker
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@798096 13f79535-47bb-0310-9956-ffa450edef68
2009-07-27 10:15:03 +00:00
Michael McCandless 26a2c427d1 LUCENE-1754: BooleanQuery detects up front if it won't match any docs and returns null from its scorer() instead of NonMatchingScorer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@798086 13f79535-47bb-0310-9956-ffa450edef68
2009-07-27 09:50:02 +00:00
Michael McCandless 228888a882 LUCENE-1644: fix highlighter to rewrite MTQ whenever it's not already a SCORING_BOOLEAN_QUERY
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@797728 13f79535-47bb-0310-9956-ffa450edef68
2009-07-25 09:31:17 +00:00
Michael McCandless be66120dff LUCENE-1644: enable different rewrite methods for MultiTermQuery
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@797694 13f79535-47bb-0310-9956-ffa450edef68
2009-07-25 00:03:33 +00:00
Otis Gospodnetic f758b4d259 - Typo
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@797310 13f79535-47bb-0310-9956-ffa450edef68
2009-07-24 02:43:58 +00:00
Simon Willnauer 999f6157c7 LUCENE-1728: Split contrib/analyzers into common and smartcn. Smartcn depends on a large dictionary that causes the analyzers jar to grow up to 3MB compressed size.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@797150 13f79535-47bb-0310-9956-ffa450edef68
2009-07-23 17:11:22 +00:00
Mark Robert Miller 3adc61c3ac LUCENE-1755: Fix WriteLineDocTask to output a document if it contains either a title or body (or both).
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@795792 13f79535-47bb-0310-9956-ffa450edef68
2009-07-20 12:19:06 +00:00
Grant Ingersoll 63402f49c7 Javadoc updates
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@795553 13f79535-47bb-0310-9956-ffa450edef68
2009-07-19 15:06:57 +00:00
Michael McCandless c79f54975e LUCENE-1505: switch local lucene to use trie's NumericUtils for mapping doubles to strings
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794721 13f79535-47bb-0310-9956-ffa450edef68
2009-07-16 15:38:06 +00:00
Mark Robert Miller add56f5e66 LUCENE-1725: Fix the example Sort algorithm - auto is now deprecated and no longer works with Benchmark. Benchmark will now throw an exception if you specify sort fields without a type. The example sort algorithm is now typed.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794109 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 22:52:58 +00:00
Mark Robert Miller ea7e4ad344 LUCENE-1688: Deprecate static final String stop word array in StopAnalyzer and replace it with an immutable implementation of CharArraySet.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794078 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 21:39:22 +00:00
Otis Gospodnetic b393e4d0af LUCENE-1491 - EdgeNGramTokenFilter no longer stops on tokens shorter than minimum n-gram size.
M    CHANGES.txt
M    analyzers/src/test/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilterTest.java
M    analyzers/src/test/org/apache/lucene/analysis/ngram/NGramTokenFilterTest.java
M    analyzers/src/java/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.java
M    analyzers/src/java/org/apache/lucene/analysis/ngram/NGramTokenFilter.java


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794034 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 19:44:52 +00:00
Michael McCandless 65494af827 LUCENE-1272: add MoreLikeThis.set/getBoost
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@793973 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 16:56:16 +00:00
Michael McCandless 91aedd6685 LUCENE-1740: add 'analyzer' command to Lucli, to change analyzer from the default StandardAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@793526 13f79535-47bb-0310-9956-ffa450edef68
2009-07-13 10:06:01 +00:00
Michael McCandless 9cbe5f4ff4 LUCENE-1522: adding new Fast Vector Highlighter contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@792542 13f79535-47bb-0310-9956-ffa450edef68
2009-07-09 13:06:51 +00:00
Michael McCandless 333e77a431 LUCENE-1704: allow specifying the Tidy configuration file when parsing HTML docs with contrib/ant
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791587 13f79535-47bb-0310-9956-ffa450edef68
2009-07-06 19:55:05 +00:00
Mark Robert Miller 28d65ceee7 remove java 1.5 dependency
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791576 13f79535-47bb-0310-9956-ffa450edef68
2009-07-06 19:18:19 +00:00
Mark Robert Miller f780f77366 LUCENE-1730: Fix TrecContentSource to use ISO-8859-1 when reading the TREC files, unless a different encoding is specified. Additionally, ContentSource now supports a content.source.encoding parameter in the configuration file.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791528 13f79535-47bb-0310-9956-ffa450edef68
2009-07-06 15:56:39 +00:00
Uwe Schindler 705f099238 Convert and cleanup the test files to UTF-8. What is still broken is the incorrect usage of KOI8 and CP1251 encodings. Added svn:eol-style=native to all files again.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791483 13f79535-47bb-0310-9956-ffa450edef68
2009-07-06 13:50:17 +00:00
Mark Robert Miller e04abc52e7 LUCENE-1599: Add clone support for SpanQuerys. SpanRegexQuery counts on this functionality and does not work correctly without it.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791280 13f79535-47bb-0310-9956-ffa450edef68
2009-07-05 17:16:16 +00:00
Mark Robert Miller 9789089343 reader should be closed after use
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791246 13f79535-47bb-0310-9956-ffa450edef68
2009-07-05 14:01:14 +00:00
Uwe Schindler fed4bba63d LUCENE-1713: Rename RangeQuery -> TermRangeQuery (part 1)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791175 13f79535-47bb-0310-9956-ffa450edef68
2009-07-04 20:14:12 +00:00
Simon Willnauer 410afb98bf LUCENE-1719: Add javadoc notes about ICUCollationKeyFilter's advantages over CollationKeyFilter (Steven Rowe via Simon Willnauer)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@790262 13f79535-47bb-0310-9956-ffa450edef68
2009-07-01 16:50:47 +00:00
Simon Willnauer 5265dc1bb2 LUCENE-1722: SmartChineseAnalyzer JavaDoc improvements - Replacing Chinese JavaDoc with English version. Robert Muir via Simon Willnauer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@790102 13f79535-47bb-0310-9956-ffa450edef68
2009-07-01 10:32:23 +00:00
Michael McCandless c7f865a4c7 LUCENE-1716: allow control over storage of norms (body norms), info stream and whether docs properties should be indexed as fields
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@788777 13f79535-47bb-0310-9956-ffa450edef68
2009-06-26 17:26:54 +00:00
Uwe Schindler 42dcc00374 Build an index.html on the top-level Javadocs folder (e.g. hudson will use it as entry point)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@788039 13f79535-47bb-0310-9956-ffa450edef68
2009-06-24 14:34:09 +00:00
Michael McCandless 87de0c9688 LUCENE-1466: added chainable CharFilter stage before Tokenizer to allow mapping of characters before tokenization
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@787795 13f79535-47bb-0310-9956-ffa450edef68
2009-06-23 19:15:31 +00:00
Michael McCandless f03d77b558 LUCENE-1630: switch from Weight (interface) to QueryWeight (abstract class); mate in/out-of docID order scoring between Collector & Scorer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@787772 13f79535-47bb-0310-9956-ffa450edef68
2009-06-23 18:11:42 +00:00
Michael McCandless 5f72065d0f LUCENE-1714: fix WriteLineDocTask to also replace \r, \n (in addition to \t) with space so those chars don't create mal-formed lines
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@787750 13f79535-47bb-0310-9956-ffa450edef68
2009-06-23 16:46:17 +00:00
Michael McCandless ec8088654d bulk fix svn:eol-style to native for text files
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@787424 13f79535-47bb-0310-9956-ffa450edef68
2009-06-22 22:18:56 +00:00
Erik Hatcher 65131ca7b9 LUCENE-1405: Added support for Ant resource collections in contrib/ant <index> task.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786610 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 18:24:19 +00:00
Michael McCandless 19234f12bd LUCENE-1692: add new contrib analyzer tests
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786606 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 18:02:12 +00:00
Michael McCandless 2f2cd20828 LUCENE-1692: add tests for Thai & SmartChinese analyzers; fix wrong endOffset bug in ThaiWordFilter; use stop words by default with SmartChineseAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786560 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 15:52:36 +00:00
Uwe Schindler 0b5cbca110 LUCENE-1673: Move TrieRange to core (part 2: removing from contrib/queries)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786474 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 12:16:52 +00:00
Uwe Schindler 7b34ab8f30 LUCENE-1673: Move TrieRange to core (part 1: addition to core)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786470 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 12:09:52 +00:00
Mark Robert Miller d7d455246f LUCENE-1595: Separate DocMaker into DocMaker and ContentSource.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786233 13f79535-47bb-0310-9956-ffa450edef68
2009-06-18 19:58:59 +00:00
Michael McCandless 835c405be0 LUCENE-973: add test case for CJKAnalyzer; fix trailing empty string bug
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@785287 13f79535-47bb-0310-9956-ffa450edef68
2009-06-16 16:38:39 +00:00
Michael Busch f2a5f395d8 Fix pom.xml.template of remote contrib to have the correct artifactId
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@784674 13f79535-47bb-0310-9956-ffa450edef68
2009-06-15 07:33:57 +00:00