lucene

Commit Graph

Author	SHA1	Message	Date
Alexander Münch	5de685cfba	Removed duplicate check in SpanGradientFormatter (#11762)	2022-09-14 13:37:31 +01:00
Adrien Grand	a426c6fec3	Fix integer overflow in tests.	2022-09-13 17:08:17 +02:00
Greg Miller	4463a0b271	GITHUB#11742: MatchingFacetSetsCounts#getTopChildren now returns top children instead of all children (#11764)	2022-09-13 06:50:52 -07:00
Dawid Weiss	e491ef797c	Retry gradle wrapper download on http 500 and 503. (#11766)	2022-09-13 10:30:20 +02:00
Dhiru Kholia	30b72ec364	Fix a typo affecting Luke (#11763)	2022-09-12 13:05:40 +02:00
Alan Woodward	41d03f69ce	Fix IntervalBuilder.NO_INTERVALS docId when unpositioned (#11760) IntervalBuilder.NO_INTERVALS should return -1 when unpositioned, not NO_MORE_DOCS. This can trigger exceptions when an empty IntervalQuery is combined in a conjunction. Fixes #11759	2022-09-09 17:19:15 +01:00
Mayya Sharipova	0ea8035612	LUCENE-10592 Better estimate memory for HNSW graph (#11743) Better estimate memory used for OnHeapHnswGraph, as well as add tests. Also don't overallocate arrays in NeighborArray Relates to #992	2022-09-08 16:54:29 -04:00
Yuting Gan	49b596ef02	Added a top-n range faceting example (#1035)	2022-09-08 12:19:42 -07:00
Julie Tibshirani	09a13aeaf2	LUCENE-10577: Remove LeafReader#searchNearestVectorsExhaustively (#11756) This PR removes the recently added function on LeafReader to exhaustively search through vectors, plus the helper function KnnVectorsReader#searchExhaustively. Instead it performs the exact search within KnnVectorQuery, using a new helper class called VectorScorer.	2022-09-08 12:15:02 -07:00
Robert Muir	f4146a44e9	Fix TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull to handle IllegalStateException from startCommit() (#11757) If ConcurrentMergeScheduler is used, and the merge hits fatal exception (such as disk full) after prepareCommit()'s ensureOpen() check, then startCommit() will throw IllegalStateException instead of AlreadyClosedException. The test is currently not prepared to handle this: the logic is only geared around exceptions coming from addDocument() Closes #11755	2022-09-08 13:35:54 -04:00
Adrien Grand	f8285fd0fe	Prevent term vectors from exceeding the maximum dictionary size. (#11726) When indexing term vectors for a very large document, the automatic computation of the dictionary size based on the overall size of the block might yield a size that exceeds the maximum window size that is supported by LZ4. This commit addresses the issue by automatically taking the minimum of the result of this computation and the maximum window size (64kB).	2022-09-08 13:44:21 +02:00
Marios Trivyzas	dbffe3472b	LUCENE-10423: Remove usages of System.currentTimeMillis() from tests (#11749) * Remove usages of System.currentTimeMillis() from tests - Use Random from `RandomizedRunner` to be able to use a Seed to reproduce tests, instead of a seed coming from wall clock. - Replace time based tests, using wall clock to determine periods with counter of repetitions, to have a consistent reproduction. Closes: #11459 * address comments * tune iterations * tune iterations for nightly	2022-09-06 17:55:01 -04:00
Dawid Weiss	d3460fa1bb	Add tidy after addVersion is called. (#11748)	2022-09-04 19:50:38 +02:00
Greg Miller	84cae4f27c	Simplify dense optimization check in TermInSetQuery (#11737)	2022-09-02 07:51:29 -07:00
Greg Miller	202dd809bd	Ensure TermInSetQuery ScoreSupplier never returns null Scorer	2022-09-01 15:31:14 -07:00
Greg Miller	680f21dca5	LUCENE-10207: TermInSetQuery now provides a ScoreSupplier with cost estimation for use in IndexOrDocValuesQuery (#1058)	2022-09-01 14:04:43 -07:00
Michael Sokolov	0462a0ad73	fixed index order needed for TestKnnVectorQuery.testScoreEuclidean (#11732)	2022-09-01 09:53:57 -04:00
Michael Sokolov	1649964f07	Forward-port CHANGES entry for quantized HNSW vectors from 9.x branch	2022-09-01 09:53:46 -04:00
Tomoko Uchida	fd86968fee	remove a link to old Jira in README.	2022-09-01 00:41:56 +09:00
Mayya Sharipova	554fabf682	LUCENE-10633 Disable sort optimization for SortedSetSortField (#3125) Add ability to SortedSetSortField to disable sort optimization	2022-08-30 16:52:28 -04:00
Michael Sokolov	61ef031f7f	SimpleText knn vectors; fix searchExhaustively and suppress a byte format test case (#11725)	2022-08-29 11:49:52 -04:00
Tomoko Uchida	29f94b0404	a bit of clarification about GitHub Milestone	2022-08-28 13:52:58 +09:00
Tomoko Uchida	6d664ccd95	adjast wording	2022-08-27 13:02:48 +09:00
Tomoko Uchida	09a7f9aa53	clarify the relation between CHANGES and Milestone	2022-08-27 12:58:33 +09:00
Tomoko Uchida	224953304c	Document about Milestone for release planning (#11723)	2022-08-27 12:29:40 +09:00
Tomoko Uchida	e61958e4fd	links to github should be '/issues'	2022-08-27 11:54:20 +09:00
Dawid Weiss	4f7543725c	#11720 Upgrade randomizedtesting to 2.8.1 (#11721)	2022-08-26 00:01:57 +02:00
Mike Drob	dbc7a9764a	Add Integer awareness to RamUsageEstimator.sizeOf (#11715) Additionally, update comments to reflect that we have not been VM cache-aware for a long time now.	2022-08-25 15:18:08 -05:00
Uwe Schindler	1d54299011	Fix classloading deadlock in analysis factories / AnalysisSPILoader initialization. This closes #11701 (#11718)	2022-08-25 18:16:04 +02:00
Tomoko Uchida	53b1ce7504	update contributing guide for GH issue (#11716)	2022-08-25 04:06:09 +09:00
Greg Miller	1529606763	Optimize TermInSetQuery for terms that match all docs in a segment (#1062)	2022-08-23 08:37:44 -07:00
Michael Sokolov	8021c2db4e	Don't throw an exception for byte-encoded vectors in SimpleText codec	2022-08-22 08:29:58 -04:00
Julie Tibshirani	df67223497	Disable byte encoding in TestSimpleTextKnnVectorsFormat	2022-08-21 17:00:57 -07:00
Julie Tibshirani	653d2ebf71	Remove KnnVectorsFormat#currentVersion (#1077) These internal versions only make sense within a codec definition, and aren't meant to be exposed and compared across codecs. Since this method is only used in tests, we can move the check to the test classes instead.	2022-08-21 13:09:07 -07:00
Michael Sokolov	daa56d30f0	Fix TestHnswGraph rare failure	2022-08-20 17:26:50 -04:00
Michael Sokolov	0a58318e16	Fix for bad cast when sorting a KnnVectors index over BytesRef (#1074)	2022-08-20 17:23:47 -04:00
Michael Sokolov	798c02dd70	fix VectorUtil.dotProductScore normalization (#1073)	2022-08-20 09:15:38 -04:00
Michael Sokolov	60fa19d509	don't call BitSet.cardinality() more than needed (#1075)	2022-08-20 08:40:50 -04:00
Michael Sokolov	f9680c6807	Add safety checks to KnnVectorField; fixed issue with copying BytesRef (#1076)	2022-08-20 08:38:42 -04:00
Tomoko Uchida	9ae3498f82	add notes about labels' color code	2022-08-20 13:22:50 +09:00
Julie Tibshirani	8308688d78	LUCENE-9583: Remove RandomAccessVectorValuesProducer (#1071) This change folds the `RandomAccessVectorValuesProducer` interface into `RandomAccessVectorValues`. This reduces the number of interfaces and clarifies the cloning/copying behavior. This is a small simplification related to LUCENE-9583, but does not address the main issue.	2022-08-19 18:04:05 -07:00
Yuting Gan	0914b537db	LUCENE-10644: Facets#getAllChildren testing should ignore child order (#1013)	2022-08-18 10:38:49 -07:00
Julie Tibshirani	7912ed02c4	Move Lucene91HnswGraphBuilder to test folder It's only used in unit tests so it can live in the backwards_codecs tests.	2022-08-17 17:10:38 -07:00
Tomoko Uchida	8b3303b25f	.asf.yaml	2022-08-16 20:02:47 +09:00
Michael Sokolov	bc214d4958	standardize exception text for vector dimension mismatch (in SimpleText codec)	2022-08-13 13:12:11 -04:00
Nick Knize	543910d900	LUCENE-10654: Fix ShapeDocValue Bounding Box failure (#1066) The base spatial test case may create invalid self crossing polygons. These polygons are cleaned by the tessellator which may result in an inconsistent bounding box between the tessellated shape and the original, invalid, geometry. This commit fixes the shape doc value test case to compute the bounding box from the cleaned geometry instead of relying on the, potentially invalid, original geometry. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2022-08-12 10:54:22 -05:00
Ignacio Vera	fe8d11254a	LUCENE-10678: Fix potential overflow when computing the partition point on the BKD tree (#1065) We currently compute the partition point for a set of points by multiplying the number of nodes that needs to be on the left of the BKD tree by the maxPointsInLeafNode. This multiplication is done on the integer space so if the partition point is bigger than Integer.MAX_VALUE it will overflow. This commit moves the multiplication to the long space so it doesn't overflow.	2022-08-11 15:25:53 +02:00
Michael Sokolov	a693fe819b	LUCENE-10577: enable quantization of HNSW vectors to 8 bits (#1054) * LUCENE-10577: enable supplying, storing, and comparing HNSW vectors with 8 bit precision	2022-08-10 17:09:07 -04:00
Vigya Sharma	59a0917e25	Fix typo in PostingsReaderBase docstring (#948) * remove extra PostingsEnum from docstring * add ImpactsEnum to docstring	2022-08-09 16:20:51 -07:00
Nick Knize	d7fd48c950	LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape (#1017) Adds new doc value field to support LatLonShape and XYShape doc values. The implementation is inspired by ComponentTree. A binary tree of tessellated components (point, line, or triangle) is created. This tree is then DFS serialized to a variable compressed DataOutput buffer to keep the doc value format as compact as possible. DocValue queries are performed on the serialized tree using a similar component relation logic as found in SpatialQuery for BKD indexed shapes. To make this possible some of the relation logic is refactored to make it accessible to the doc value query counterpart. Note this does not support the following: * Multi Geometries or Collections - This will be investigated by exploring the addition of multi binary doc values. * General Geometry Queries - This will be added in a follow on improvement. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2022-08-09 12:51:45 -05:00

36201 Commits

All Branches