Commit Graph

361 Commits

Author SHA1 Message Date
Doron Cohen ece8361ab5 LUCENE-749: ChainedFilter behavior fixed when logic of first filter is ANDNOT.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607606 13f79535-47bb-0310-9956-ffa450edef68
2007-12-30 22:47:59 +00:00
Grant Ingersoll bd340a896d git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607291 13f79535-47bb-0310-9956-ffa450edef68 2007-12-28 17:08:26 +00:00
Grant Ingersoll 5c81934465 git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607290 13f79535-47bb-0310-9956-ffa450edef68 2007-12-28 17:08:16 +00:00
Grant Ingersoll cb94c6aed4 git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607288 13f79535-47bb-0310-9956-ffa450edef68 2007-12-28 17:07:33 +00:00
Grant Ingersoll 40d85a7781 Switch to using the EnwikiDocMaker
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@607281 13f79535-47bb-0310-9956-ffa450edef68
2007-12-28 16:29:03 +00:00
Grant Ingersoll ca821526b0 removed bad chars at end of file
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605966 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 15:33:16 +00:00
Grant Ingersoll 36b1206ad8 Restoring ExtractWikipedia, as it is still a handy class to have around. Splitting the documents is useful for debugging purposes when you know you want to look at a specific document instead of grepping through a really large file.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@605959 13f79535-47bb-0310-9956-ffa450edef68
2007-12-20 15:14:24 +00:00
Grant Ingersoll 55d0c3a2f8 LUCENE-1077: refactored to have a common PayloadHelper class. Also added TokenOffsetPayloadTokenFilter, which encodes the Token offset into the payloads
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@604870 13f79535-47bb-0310-9956-ffa450edef68
2007-12-17 13:55:46 +00:00
Doron Cohen b7e167ac8d LUCENE-1086: DocMakers setup for the "docs.dir" property fails when passing an absolute path.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@603856 13f79535-47bb-0310-9956-ffa450edef68
2007-12-13 08:58:52 +00:00
Doron Cohen 73f9e7ebc0 fix potential thread-safety issue in contrib/benchmark's TrecDocMaker.
(follow-up to http://svn.apache.org/viewvc?view=rev&revision=602475)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@603846 13f79535-47bb-0310-9956-ffa450edef68
2007-12-13 07:26:58 +00:00
Michael McCandless 86ca6f86d7 fix intermittent thread-safety failure in contrib/benchmark unit test
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@602475 13f79535-47bb-0310-9956-ffa450edef68
2007-12-08 14:17:07 +00:00
Michael McCandless b0d2b1c90e LUCENE-1044: revert the doSync option to FSDirectory
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@602165 13f79535-47bb-0310-9956-ffa450edef68
2007-12-07 17:42:33 +00:00
Grant Ingersoll f9b2e971f2 LUCENE-1077 new sinks and payloads analysis packages
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@602081 13f79535-47bb-0310-9956-ffa450edef68
2007-12-07 12:21:49 +00:00
Michael McCandless 6be2c0765c LUCENE-1044: also re-default doSync back to false in contrib/benchmark
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@596874 13f79535-47bb-0310-9956-ffa450edef68
2007-11-20 23:17:44 +00:00
Michael Busch b04703fe8f LUCENE-1055: Remove gdata from trunk.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@596501 13f79535-47bb-0310-9956-ffa450edef68
2007-11-20 00:46:27 +00:00
Michael Busch 1abb04580f Disable verbose standard output in MemoryIndexTest by default.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@596000 13f79535-47bb-0310-9956-ffa450edef68
2007-11-17 20:19:17 +00:00
Mark Harwood 04ae927f38 Added toString implementation on BooleanFilter.java, provided by Jason Calabrese
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@595996 13f79535-47bb-0310-9956-ffa450edef68
2007-11-17 20:08:06 +00:00
Michael Busch bb37d2bcff LUCENE-1051: Generate separate javadocs for core, demo and contrib classes, as well as a unified view.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@595013 13f79535-47bb-0310-9956-ffa450edef68
2007-11-14 19:16:19 +00:00
Michael McCandless 439ba586fc LUCENE-1044: add doSync option to FSDirectory.getDirectory, defaulting to true, to sync() each file descriptor before close()
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@593799 13f79535-47bb-0310-9956-ffa450edef68
2007-11-10 17:51:00 +00:00
Daniel Naber 2f5507bfc9 fix returning unbalanced quotes in describeParams()
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@586918 13f79535-47bb-0310-9956-ffa450edef68
2007-10-21 17:26:16 +00:00
Grant Ingersoll a614f0d99a Added some more algorithms for testing things out, implemented basic TREC query driver based on the sample in the javadocs.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@585677 13f79535-47bb-0310-9956-ffa450edef68
2007-10-17 20:36:20 +00:00
Grant Ingersoll b7253a06b7 LUCENE-1027: Added better formatting of doubles, added wikipedia-flush-by-RAM for comparison
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@583792 13f79535-47bb-0310-9956-ffa450edef68
2007-10-11 12:10:31 +00:00
Grant Ingersoll 9c9ebe5cf4 LUCENE-1027: Added support for doubles to Config, also added copies of standard and micro-standard algorithms that flush by RAM
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@583771 13f79535-47bb-0310-9956-ffa450edef68
2007-10-11 11:05:40 +00:00
Mark Harwood 21a07ee41e Provided DTDs for core and contrib XML query syntax. The "docs" directory contains detailed documentation generated by DTDdoc from the DTDs. The ant script used to generate these docs is also included but not hooked up to the main build process due to license issues with DTDdoc.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@583307 13f79535-47bb-0310-9956-ffa450edef68
2007-10-09 21:45:27 +00:00
Mark Harwood 3872d3bfcc Updated hashcode/equals to test all fields
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@583305 13f79535-47bb-0310-9956-ffa450edef68
2007-10-09 21:40:45 +00:00
Chris M. Hostetter 243861715b cleaning up a ton of javadoc warnings from gdata. most of these fixes related to either: clarifying packages for @link tags; changing @link or @see tags that pointed at classes/methods that didn't exist (by picking classes with very similar names that do exist); or removing incomplete stub javadocs (that added no information beyond the signature)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@582380 13f79535-47bb-0310-9956-ffa450edef68
2007-10-05 20:30:59 +00:00
Mark Harwood 62fa7b4b82 Added new DuplicateFilter functionality to filter documents sharing a field value (e.g. primary key/url)
Also includes JUnit test and XML Query support
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@581426 13f79535-47bb-0310-9956-ffa450edef68
2007-10-02 22:56:46 +00:00
Grant Ingersoll dce47c6401 LUCENE-1005, apply GMT timeZone to the data formatter so it outputs properly formatted dates
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@579766 13f79535-47bb-0310-9956-ffa450edef68
2007-09-26 19:15:26 +00:00
Grant Ingersoll bcfad28d69 LUCENE-1005, apply GMT timeZone to the data formatter so it outputs properly formatted dates
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@579765 13f79535-47bb-0310-9956-ffa450edef68
2007-09-26 19:12:59 +00:00
Michael McCandless a28eb4d978 LUCENE-994: change defaults in IndexWriter to maximize 'out of the box' indexing speed
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@579360 13f79535-47bb-0310-9956-ffa450edef68
2007-09-25 20:02:07 +00:00
Michael McCandless 511406ecbe remove temporary print for GData unit test
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@577892 13f79535-47bb-0310-9956-ffa450edef68
2007-09-20 19:41:42 +00:00
Chris M. Hostetter 3f517bff75 don't just write date to stdout, include date string in failure message
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@577873 13f79535-47bb-0310-9956-ffa450edef68
2007-09-20 18:52:42 +00:00
Michael McCandless fada31fa7f adding temporary print to figure out why this gdata-server test is failing on build machine
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@577861 13f79535-47bb-0310-9956-ffa450edef68
2007-09-20 18:38:23 +00:00
Doron Cohen 40f0adb507 LUCENE-941: (leftover - add info in benchmark/CHANGES.txt entry)
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@576790 13f79535-47bb-0310-9956-ffa450edef68
2007-09-18 09:13:15 +00:00
Doron Cohen 9e51c30349 LUCENE-941: benchmark: infinite loop for alg: {[AddDoc(4000)]: 4} : *
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@576786 13f79535-47bb-0310-9956-ffa450edef68
2007-09-18 09:05:06 +00:00
Michael Busch 9c2a036db3 - LUCENE-908: Improvements and simplifications for how the MANIFEST file and the META-INF dir are created.
- LUCENE-935: Various improvements for the maven artifacts. Now the artifacts also include the sources as .jar files.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@568766 13f79535-47bb-0310-9956-ffa450edef68
2007-08-22 23:16:48 +00:00
Grant Ingersoll c67fd79a83 LUCENE-981 and LUCENE-980: Added new AnalyzerTask and fixed issue with long strings in Format.java
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@567262 13f79535-47bb-0310-9956-ffa450edef68
2007-08-18 12:24:21 +00:00
Grant Ingersoll d1f90c7825 Deprecated all the old benchmarking stuff
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@566454 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 00:49:32 +00:00
Grant Ingersoll 9192b16643 Deprecated all the old benchmarking stuff
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@566435 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 00:23:06 +00:00
Grant Ingersoll 477c4e0efe Deprecated all the old benchmarking stuff
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@566433 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 00:22:46 +00:00
Michael McCandless d42de32984 LUCENE-969: deprecate Token.termText() & optimize core tokenizers by re-using tokens & TokenStreams
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@564715 13f79535-47bb-0310-9956-ffa450edef68
2007-08-10 18:34:33 +00:00
Grant Ingersoll 82eb074afd LUCENE-974: Removed Author tags from all existing code
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@564236 13f79535-47bb-0310-9956-ffa450edef68
2007-08-09 15:21:19 +00:00
Michael McCandless d1422ebd6b LUCENE-971: extract wikipedia documents as a doc maker directly from XML file without using intermediate one-file-per-document
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@564151 13f79535-47bb-0310-9956-ffa450edef68
2007-08-09 08:57:26 +00:00
Michael McCandless 2d954694dc LUCENE-966: sizable (~6X faster) speedups to StandardTokenizer by using JFlex instead of JavaCC
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@564036 13f79535-47bb-0310-9956-ffa450edef68
2007-08-08 22:26:44 +00:00
Michael McCandless 0fd867732e LUCENE-967: add ReadTokensTask to allow for benchmarking just tokenization
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@561908 13f79535-47bb-0310-9956-ffa450edef68
2007-08-01 18:54:43 +00:00
Doron Cohen f3b9c9407a for LUCENE-836 sort reuters files by name (otherwise TestQualityRun can fail on some OSs).
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@560428 13f79535-47bb-0310-9956-ffa450edef68
2007-07-27 23:56:48 +00:00
Doron Cohen 98fa2d898d LUCENE-836: Add support for search quality benchmarking.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@560372 13f79535-47bb-0310-9956-ffa450edef68
2007-07-27 20:24:52 +00:00
Michael McCandless 02dd452026 LUCENE-947: add creation of & indexing from 'one document per line' text files to minimize IO overhead of creating documents when running tests
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@559366 13f79535-47bb-0310-9956-ffa450edef68
2007-07-25 08:54:58 +00:00
Grant Ingersoll e97d5830ce LUCENE-868: New Term Vector access mechanism. Allows for applications to define how they access term vector information instead of having to pack/unpack the TV info returned by the old way.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@558592 13f79535-47bb-0310-9956-ffa450edef68
2007-07-23 03:17:25 +00:00
Michael McCandless 96ea45d193 LUCENE-952: force synchronized access to writer instance variable to fix infinite spin loop in TestGdataIndexer
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@554189 13f79535-47bb-0310-9956-ffa450edef68
2007-07-07 12:28:04 +00:00