40 Commits

Author SHA1 Message Date
Robert Muir d2af6ef0bd LUCENE-1794: Implement TokenStream reuse for contrib Analyzers
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@804680 13f79535-47bb-0310-9956-ffa450edef68
2009-08-16 12:37:05 +00:00
Robert Muir 43a5bd6c19 LUCENE-1628: Add Persian Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@802955 13f79535-47bb-0310-9956-ffa450edef68
2009-08-10 23:29:27 +00:00
Robert Muir 820620f3a7 LUCENE-1758: Update ArabicAnalyzer to light10 stemming, stopwords improvements, lowercase non-Arabic text
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@801348 13f79535-47bb-0310-9956-ffa450edef68
2009-08-05 18:22:22 +00:00
Mark Robert Miller f0e54e31e6 LUCENE-1406 belongs in contrib CHANGES
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800832 13f79535-47bb-0310-9956-ffa450edef68
2009-08-04 15:05:34 +00:00
Mark Robert Miller b44ed588ac LUCENE-1685 should be in API changes, not new features
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800821 13f79535-47bb-0310-9956-ffa450edef68
2009-08-04 14:33:58 +00:00
Mark Robert Miller 10b41d2dce LUCENE-1685: The position aware SpanScorer has become the default scorer for Highlighting. The SpanScorer implementation has replaced QueryScorer and the old term highlighting QueryScorer has been renamed to QueryTermScorer. Multi-term queries are also now expanded by default. If you were previously rewriting the query for multi-term query highlighting, you should no longer do that (unless you switch to using QueryTermScorer). The SpanScorer API (now QueryScorer) has also been improved to more closely match the API of the previous QueryScorer implementation.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800796 13f79535-47bb-0310-9956-ffa450edef68
2009-08-04 13:56:11 +00:00
Mark Robert Miller 4054b4ebf3 move the web-based XML demo from core changes to contrib changes - also fixes skipping #34 in features
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800661 13f79535-47bb-0310-9956-ffa450edef68
2009-08-04 02:57:00 +00:00
Michael Busch 457c29d31e LUCENE-1775: Change remaining contrib TokenFilters (shingle, prefix-suffix) to use the new TokenStream API.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800195 13f79535-47bb-0310-9956-ffa450edef68
2009-08-03 04:33:10 +00:00
Mark Robert Miller 5aaf5b0167 LUCENE-1486: Move ComplexPhraseQueryParser to contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800193 13f79535-47bb-0310-9956-ffa450edef68
2009-08-03 04:06:22 +00:00
Michael Busch 343992fcbb LUCENE-1567: New flexible QueryParser framework.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@800191 13f79535-47bb-0310-9956-ffa450edef68
2009-08-03 03:38:44 +00:00
Michael McCandless bbcab117d9 LUCENE-1683: fixed JavaUtilRegexCapabilities (used by RegexQuery) to match entire string not just prefix
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799678 13f79535-47bb-0310-9956-ffa450edef68
2009-07-31 18:02:56 +00:00
Michael McCandless 0b0d13dffe LUCENE-1745: allow passing matching flags to the underlying regexp engine
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799667 13f79535-47bb-0310-9956-ffa450edef68
2009-07-31 17:41:04 +00:00
Mark Robert Miller f73a4f4324 LUCENE-1695: Update the Highlighter to use the new TokenStream API. This issue breaks backwards compatibility with some public classes. If you have implemented custom Fragmenters or Scorers, you will need to adjust them to work with the new TokenStream API. Rather than being passed one Token at a time, you will be given a TokenStream to init your impl with - store the Attributes you are interested in locally and access them on each call to the method that used to pass a new Token. Look at the included updated impls for examples.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@799455 13f79535-47bb-0310-9956-ffa450edef68
2009-07-30 22:00:47 +00:00
Mark Robert Miller afb517e832 LUCENE-1752: Missing highlights when terms were repeated in separate, nested, boolean or disjunction queries.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@798976 13f79535-47bb-0310-9956-ffa450edef68
2009-07-29 16:47:05 +00:00
Michael McCandless c79f54975e LUCENE-1505: switch local lucene to use trie's NumericUtils for mapping doubles to strings
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794721 13f79535-47bb-0310-9956-ffa450edef68
2009-07-16 15:38:06 +00:00
Otis Gospodnetic b393e4d0af LUCENE-1491 - EdgeNGramTokenFilter no longer stops on tokens shorter than minimum n-gram size.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@794034 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 19:44:52 +00:00
Michael McCandless 65494af827 LUCENE-1272: add MoreLikeThis.set/getBoost
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@793973 13f79535-47bb-0310-9956-ffa450edef68
2009-07-14 16:56:16 +00:00
Michael McCandless 91aedd6685 LUCENE-1740: add 'analyzer' command to Lucli, to change analyzer from the default StandardAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@793526 13f79535-47bb-0310-9956-ffa450edef68
2009-07-13 10:06:01 +00:00
Michael McCandless 9cbe5f4ff4 LUCENE-1522: adding new Fast Vector Highlighter contrib
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@792542 13f79535-47bb-0310-9956-ffa450edef68
2009-07-09 13:06:51 +00:00
Michael McCandless 333e77a431 LUCENE-1704: allow specifying the Tidy configuration file when parsing HTML docs with contrib/ant
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@791587 13f79535-47bb-0310-9956-ffa450edef68
2009-07-06 19:55:05 +00:00
Uwe Schindler 0b5cbca110 LUCENE-1673: Move TrieRange to core (part 2: removing from contrib/queries)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@786474 13f79535-47bb-0310-9956-ffa450edef68
2009-06-19 12:16:52 +00:00
Karl-Johan Wettin 196428ec39 LUCENE-1578: Support for loading unoptimized readers to the constructor of InstantiatedIndex. (Karl Wettin)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@784481 13f79535-47bb-0310-9956-ffa450edef68
2009-06-13 21:54:07 +00:00
Grant Ingersoll 1511ec5e31 LUCENE-1676: in-stream payload support
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@784297 13f79535-47bb-0310-9956-ffa450edef68
2009-06-12 22:26:01 +00:00
Michael McCandless 2dd7d33e86 LUCENE-1643: use reusable RawCollationKey for better performance
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@776252 13f79535-47bb-0310-9956-ffa450edef68
2009-05-19 09:50:24 +00:00
Michael McCandless be0a47b7e3 LUCENE-1629: move CHANGES entry to contrib; add TestArabicAnalyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@774727 13f79535-47bb-0310-9956-ffa450edef68
2009-05-14 10:50:52 +00:00
Michael McCandless 8c4fff6e21 LUCENE-1591: add bzip2 compression/decompression to contrib/benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@765543 13f79535-47bb-0310-9956-ffa450edef68
2009-04-16 09:46:30 +00:00
Michael McCandless c73712d1bb LUCENE-1576: fix BrazilianAnalyzer to downcase before filtering stop words
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@759307 13f79535-47bb-0310-9956-ffa450edef68
2009-03-27 19:04:25 +00:00
Michael McCandless 6bf0e6e09b LUCENE-1435: add contrib/collation (CollationKeyFilter), to convert tokens into indexable CollationKeys
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@755914 13f79535-47bb-0310-9956-ffa450edef68
2009-03-19 10:51:55 +00:00
Michael McCandless e44e6b0603 LUCENE-1490: forgot CHANGES.txt update
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@755746 13f79535-47bb-0310-9956-ffa450edef68
2009-03-18 21:42:17 +00:00
Michael McCandless 6248e14515 LUCENE-1548: fix distance normalization in LevenshteinDistance to not produce negative distances
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@748534 13f79535-47bb-0310-9956-ffa450edef68
2009-02-27 14:07:12 +00:00
Karl-Johan Wettin 6e692d38ec LUCENE-1531
Added support for BoostingTermQuery to XML query parser.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@742411 13f79535-47bb-0310-9956-ffa450edef68
2009-02-09 11:49:33 +00:00
Karl-Johan Wettin d7376608b2 LUCENE-1514
ShingleMatrixFilter#next(Token) easily throws a StackOverflowError due to recursive invocation. (Karl Wettin)


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@733064 13f79535-47bb-0310-9956-ffa450edef68
2009-01-09 15:34:52 +00:00
Karl-Johan Wettin f991524da8 LUCENE-1510
InstantiatedIndexReader#norms methods throw NullPointerException on empty index.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@732661 13f79535-47bb-0310-9956-ffa450edef68
2009-01-08 09:28:42 +00:00
Karl-Johan Wettin 219a20a945 LUCENE-1462
InstantiatedIndexWriter did not reset pre-analyzed TokenStreams the same way IndexWriter does.
Parts of InstantiatedIndex were not Serializable.



git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@725837 13f79535-47bb-0310-9956-ffa450edef68
2008-12-11 22:08:45 +00:00
Michael McCandless 481f8080ab LUCENE-1470: add TrieRangeQuery, a much more efficient implementation of RangeQuery at the expense of added space consumed in the index
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@723031 13f79535-47bb-0310-9956-ffa450edef68
2008-12-03 19:38:31 +00:00
Karl-Johan Wettin 456b10fdf9 LUCENE-1423
InstantiatedTermEnum#skipTo(Term) throws ArrayIndexOutOfBoundsException on an empty index.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@705893 13f79535-47bb-0310-9956-ffa450edef68
2008-10-18 16:29:53 +00:00
Michael McCandless 98e1129a14 break off contrib/CHANGES.txt's 2.4.0 release section
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@700743 13f79535-47bb-0310-9956-ffa450edef68
2008-10-01 11:22:58 +00:00
Karl-Johan Wettin 82c70c018e LUCENE-1016: TermVectorAccessor, transparent vector space access via stored vectors or by resolving the inverted index.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@688745 13f79535-47bb-0310-9956-ffa450edef68
2008-08-25 15:02:20 +00:00
Karl-Johan Wettin 3034575f66 LUCENE-1142: Updated Snowball package, org.tartarus distribution revision 500.
Introducing Hungarian, Turkish and Romanian support, updated older stemmers and optimized (reflectionless) SnowballFilter.

IMPORTANT NOTICE ON BACKWARDS COMPATIBILITY: an index created using version 2.3.2 (or older) might not be compatible with these updated classes as some algorithms have changed.



git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@688420 13f79535-47bb-0310-9956-ffa450edef68
2008-08-23 22:02:47 +00:00
Karl-Johan Wettin 9fe7a35378 Contrib level CHANGES.txt. I forgot to add this some time ago.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@688370 13f79535-47bb-0310-9956-ffa450edef68
2008-08-23 17:12:57 +00:00