lucene

Commit Graph

Author	SHA1	Message	Date
Julie Tibshirani	fb09ae1f7c	Undo accidental change to build.gradle	2022-01-23 16:26:16 -08:00
Julie Tibshirani	7ece8145bc	LUCENE-10375: Write vectors to file in flush (#617 ) In a previous commit, we updated HNSW merge to first write the combined segment vectors to a file, then use that file to build the graph. This commit applies the same strategy to flush, which lets us use the same logic for flush and merge.	2022-01-23 16:19:23 -08:00
Dawid Weiss	08d6633d94	LUCENE-8930: increase timeout for the launched luke.	2022-01-20 16:51:05 +01:00
Ignacio Vera	4ec8f865c8	LUCENE-10288: Check BKD tree shape for lucene pre-8.6 1D indexes (#607 ) Adds efficient logic to compute if a tree is balanced or unbalanced for indexes created before Lucene 8.6	2022-01-20 07:49:29 +01:00
Dawid Weiss	72ba7ae2ee	LUCENE-8930: script testing in the distribution (#550 )	2022-01-20 00:09:15 +09:00
Julie Tibshirani	9b6d417d1c	LUCENE-10040: Update HnswGraph javadoc related to deletions Previously it claimed the search method did not handle deletions.	2022-01-18 15:36:00 -08:00
Julie Tibshirani	dfca9a5608	LUCENE-10375: Write merged vectors to file before building graph (#601 ) When merging segments together, the `KnnVectorsWriter` creates a `VectorValues` instance with a merged view of all the segments' vectors. This merged instance is used when constructing the new HNSW graph. Graph building needs random access, and the merged VectorValues support this by mapping from merged ordinals to segments and segment ordinals. This mapping can add significant overhead when building the graph. This change updates the HNSW merging logic to first write the combined segment vectors to a file, then use that file to build the graph. This helps speed up segment merging, and also lets us simplify `VectorValuesMerger`, which provides the merged view of vector values.	2022-01-18 13:53:05 -08:00
Alan Woodward	2e2c4818d1	LUCENE-10377: Replace 'sortPos' with 'enableSkipping' in SortField.getComparator() (#603 ) The sort position parameter in SortField.getComparator() is only ever used to determine whether or not skipping should be enabled on a given comparator, so the parameter name should reflect that. This commit also explicitly disables skipping in a number of cases where it is never used, in particular CheckIndex and the grouping collectors.	2022-01-17 10:44:57 +00:00
Adrien Grand	457367e9b7	LUCENE-10168: Fix typo that would _not_ run nightly tests.	2022-01-14 13:51:16 +01:00
Greg Miller	2f5e3c323b	LUCENE-10379: Count directly into the dense values array in FastTaxonomyFacetCounts#countAll (#605 ) Co-authored-by: guofeng.my <guofeng.my@bytedance.com>	2022-01-13 09:17:55 -08:00
Mayya Sharipova	bd2cc4124d	Small edits for KnnGraphTester (#575 ) 1. Correct the remaining size for input files larger than Integer.MAX_VALUE, as currently with every iteration we try to map the next blockSize bytes even if fewer than blockSize bytes are left in the file. 2. Correct java.lang.ClassCastException when retrieving KnnGraphValues for stats printing. 3. Add an option for euclidean metric	2022-01-12 17:23:10 -05:00
gf2121	8d9fa6dba1	revert LUCENE-10355 (#597 ) Trying to find the source of taxo-facet performance regression. See also LUCENE-10374 Co-authored-by: guofeng.my <guofeng.my@bytedance.com>	2022-01-12 10:23:13 -08:00
Adrien Grand	71dfa9e9cd	addBackcompatIndexes.py should use Gradle, not Ant. (#531 )	2022-01-12 18:55:59 +01:00
Uwe Schindler	636d42e032	Fix wrong project name	2022-01-11 17:42:21 +01:00
Nikola Grcevski	bad65c53c9	LUCENE-10369: Move DelegatingCacheHelper to FilterDirectoryReader (#596 )	2022-01-11 15:22:06 +01:00
Adrien Grand	308ddd7502	Add documentation on file formats. (#598 )	2022-01-11 15:16:05 +01:00
Adrien Grand	f81c760cc8	LUCENE-10370: Fix precommit.	2022-01-11 10:13:10 +01:00
Dawid Weiss	9b54fbaa01	LUCENE-10370: temporarily ignore TestStressNRTReplication	2022-01-11 09:25:31 +01:00
Greg Miller	82703757fe	LUCENE-10245: Addition of MultiDoubleValues(Source) and MultiLongValues(Source) along with faceting capabilities (#543 )	2022-01-10 13:48:36 -08:00
Dawid Weiss	bff930c1bf	LUCENE-10370: temporarily ignore TestNRTReplication.	2022-01-10 22:18:12 +01:00
Greg Miller	cf12b46092	LUCENE-10356: Further optimize facet counting for single-valued TaxonomyFacetCounts (#585 )	2022-01-10 10:23:46 -08:00
Greg Miller	eb0b1bf9f1	Add CHANGES entry for LUCENE-10250	2022-01-10 08:57:28 -08:00
Marc D'mello	b4e27f2c63	LUCENE-10250: Add support for arbitrary length hierarchical SSDV facets (#509 )	2022-01-10 08:52:14 -08:00
gf2121	e750f6cd37	LUCENE-10350: Avoid some null checking for FastTaxonomyFacetCounts#countAll() (#578 )	2022-01-10 07:43:09 -08:00
Adrien Grand	2ebc57a465	LUCENE-10283: Bump minimum required Java version to 17. (#579 ) Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>	2022-01-10 15:42:15 +01:00
Adrien Grand	74698994a9	Simplify some exception handling with try-with-resources. (#589 )	2022-01-10 15:40:47 +01:00
Yannick Welsch	d9d65ab849	LUCENE-10291: Don't use CFS in testMinimalCodec (#593 ) This test was occasionally failing on CI, as the test randomly installed a merge policy that would force compound file creation while the goal of the test was not to do so.	2022-01-10 12:17:45 +00:00
Uwe Schindler	42fe2d5620	LUCENE-10364: Prepare and update errorprone plugin for Java 17 (#590 )	2022-01-07 19:19:46 +01:00
zacharymorn	d0ad9f5bfc	LUCENE-10183: KnnVectorsWriter#writeField to take KnnVectorsReader instead of VectorValues (#534 )	2022-01-06 22:14:41 -08:00
Robert Muir	f2e00bb9e0	LUCENE-10353: add random null injection to TestRandomChains (#586 ) Co-authored-by: Uwe Schindler <uschindler@apache.org>, Robert Muir <rmuir@apache.org>	2022-01-06 16:56:49 +01:00
Adrien Grand	603a43f668	Fix path of docs for import into the website. (#524 ) The current `svn import` looks for docs where they used to be produced by the `Ant` build, but `Gradle` now puts them in a different place.	2022-01-06 09:26:45 +01:00
Dawid Weiss	b8da9f32c8	LUCENE-10328: open up certain packages for junit and the test framework (reflective access).	2022-01-05 21:02:51 +01:00
Dawid Weiss	ff547e7bbd	LUCENE-10328: Module path for compiling and running tests is wrong (#571 )	2022-01-05 20:42:02 +01:00
Adrien Grand	c8651afde7	LUCENE-10354: Clarify contract of codec APIs with missing/disabled fields. (#583 )	2022-01-05 18:47:35 +01:00
Adrien Grand	7fdba36941	LUCENE-10291: Bug fix.	2022-01-05 16:37:37 +01:00
Adrien Grand	f9ff620ec6	LUCENE-10291: CHANGES entry	2022-01-05 16:30:58 +01:00
Yannick Welsch	8fa7412dec	LUCENE-10291: Only read/write postings when there is at least one indexed field (#539 )	2022-01-05 16:28:00 +01:00
Adrien Grand	65296e5f84	Use CDN to download source release. (#529 )	2022-01-05 15:54:33 +01:00
Adrien Grand	6149387f7c	Modernize release announcement text. (#525 ) It currently reads as if Lucene were only a full-text search library, when it can do much more than that nowadays.	2022-01-05 15:53:49 +01:00
Uwe Schindler	475fbd0bdd	LUCENE-10352: Convert TestAllAnalyzersHaveFactories and TestRandomChains to a global integration test and discover classes to check from module system (#582 ) Co-authored-by: Robert Muir <rmuir@apache.org>	2022-01-05 15:35:02 +01:00
gf2121	238119224a	LUCENE-10343: Remove MyRandom in favor of test framework random (#573 )	2022-01-05 15:31:00 +01:00
gf2121	60b80017cb	LUCENE-10355: clean zeros (#584 )	2022-01-05 15:23:16 +01:00
Mayya Sharipova	78da703037	LUCENE-10351 Correct knn search failure with deleted docs (#580 ) Currently, when doing a knn search on a segment where all documents with a knn field were deleted, we get the following error: maxSize must be > 0 and < 2147483630; got: 0 java.lang.IllegalArgumentException: maxSize must be > 0 and < 2147483630; got: 0 at __randomizedtesting.SeedInfo.seed([43F1F124D7076A4E:1B860BFCCB9B0BB5]:0) at org.apache.lucene.util.LongHeap.<init>(LongHeap.java:57) at org.apache.lucene.util.LongHeap$1.<init>(LongHeap.java:69) at org.apache.lucene.util.LongHeap.create(LongHeap.java:69) at org.apache.lucene.util.hnsw.NeighborQueue.<init>(NeighborQueue.java:41) at org.apache.lucene.util.hnsw.HnswGraph.search(HnswGraph.java:105) This patch fixes this error and ensures empty TopDocs are returned when the knn field doesn't have any documents left.	2022-01-04 15:59:30 -05:00
Uwe Schindler	4bacf93c7e	LUCENE-10348: Make stopwords resources from analyzers modules visible to ClasspathResourceLoader and ModuleResourceLoader (#581 )	2022-01-04 15:05:29 +01:00
Christine Poerschke	ef1a554204	Update copyright year in NOTICE.txt file.	2022-01-04 10:43:46 +00:00
Dawid Weiss	0f0d06ca28	LUCENE-10347: add a helper task 'collectRuntimeJars' that assembles binary artifacts under each module's build 'runtimeJars' folder. (#576 )	2022-01-03 21:11:35 +01:00
Adrien Grand	cc5634f0f1	Remove unused backward indices.	2022-01-03 15:17:47 +01:00
Uwe Schindler	305d9ebb86	LUCENE-10349: Cleanup WordListLoader to use try-with-resources and make the default stop words unmodifiable (#577 )	2022-01-03 15:07:44 +01:00
Adrien Grand	835e821287	LUCENE-10346: Move CHANGES entry to 9.1.	2022-01-03 15:04:24 +01:00
Uwe Schindler	8b5887f244	LUCENE-10287: Remove obsolete changes entry (we now have a warning and won't rely on the module when starting luke)	2022-01-03 14:59:12 +01:00

35684 Commits, All Branches
