Jan Høydahl
00a8112d97
LUCENE-10365 Wizard changes contributed from Solr ( #591 )
2022-09-20 12:07:42 +02:00
Alex
26d6063ec3
GitHub Workflows security hardening ( #11789 )
2022-09-20 11:28:07 +02:00
Ignacio Vera
ecb0ba542b
Improve tessellator performance by delaying calls to the method #isIntersectingPolygon ( #11786 )
2022-09-20 07:15:38 +02:00
Michael Sokolov
accc3bdcfa
update DOAP and releaseWizard to reflect migration to github ( #11747 )
2022-09-19 13:53:26 -04:00
Michael Sokolov
07af358f90
Diversity check bugfix ( #11781 )
* Fixes bug in HNSW diversity checks introduced in LUCENE-10577
2022-09-19 11:48:59 -04:00
Michael Sokolov
e69c48b8d9
Fix rare bug in TestKnnVectorQuery when we have multiple segments
2022-09-18 20:21:39 +00:00
Namgyu Kim
451bab300e
GITHUB#11778: Add detailed part-of-speech tag for particle and ending on Nori ( #11779 )
2022-09-17 00:42:35 +09:00
Adrien Grand
155876a902
LUCENE-10674: Move changes entry to 9.4.
2022-09-16 16:59:42 +02:00
Dawid Weiss
9acc653995
GH-11172: remove WindowsDirectory and native subproject. ( #11774 )
2022-09-15 16:22:46 +02:00
John Mazanec
0587844742
LUCENE-10674: Ensure BitSetConjDISI returns NO_MORE_DOCS when sub-iterator exhausts. ( #1068 )
Signed-off-by: John Mazanec <jmazane@amazon.com>
2022-09-15 11:21:39 +02:00
Alexander Münch
5de685cfba
Removed duplicate check in SpanGradientFormatter ( #11762 )
2022-09-14 13:37:31 +01:00
Adrien Grand
a426c6fec3
Fix integer overflow in tests.
2022-09-13 17:08:17 +02:00
Greg Miller
4463a0b271
GITHUB#11742: MatchingFacetSetsCounts#getTopChildren now returns top children instead of all children ( #11764 )
2022-09-13 06:50:52 -07:00
Dawid Weiss
e491ef797c
Retry gradle wrapper download on http 500 and 503. ( #11766 )
2022-09-13 10:30:20 +02:00
Dhiru Kholia
30b72ec364
Fix a typo affecting Luke ( #11763 )
2022-09-12 13:05:40 +02:00
Alan Woodward
41d03f69ce
Fix IntervalBuilder.NO_INTERVALS docId when unpositioned ( #11760 )
IntervalBuilder.NO_INTERVALS should return -1 when unpositioned,
not NO_MORE_DOCS. This can trigger exceptions when an empty
IntervalQuery is combined in a conjunction.
Fixes #11759
2022-09-09 17:19:15 +01:00
Mayya Sharipova
0ea8035612
LUCENE-10592 Better estimate memory for HNSW graph ( #11743 )
Better estimate memory used for OnHeapHnswGraph,
as well as add tests.
Also don't overallocate arrays in NeighborArray
Relates to #992
2022-09-08 16:54:29 -04:00
Yuting Gan
49b596ef02
Added a top-n range faceting example ( #1035 )
2022-09-08 12:19:42 -07:00
Julie Tibshirani
09a13aeaf2
LUCENE-10577: Remove LeafReader#searchNearestVectorsExhaustively ( #11756 )
This PR removes the recently added function on LeafReader to exhaustively search
through vectors, plus the helper function KnnVectorsReader#searchExhaustively.
Instead it performs the exact search within KnnVectorQuery, using a new helper
class called VectorScorer.
2022-09-08 12:15:02 -07:00
Robert Muir
f4146a44e9
Fix TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull to handle IllegalStateException from startCommit() ( #11757 )
If ConcurrentMergeScheduler is used, and the merge hits fatal exception (such as disk full) after prepareCommit()'s ensureOpen() check, then startCommit() will throw IllegalStateException instead of AlreadyClosedException.
The test is currently not prepared to handle this: the logic is only geared around exceptions coming from addDocument()
Closes #11755
2022-09-08 13:35:54 -04:00
Adrien Grand
f8285fd0fe
Prevent term vectors from exceeding the maximum dictionary size. ( #11726 )
When indexing term vectors for a very large document, the automatic computation
of the dictionary size based on the overall size of the block might yield a
size that exceeds the maximum window size that is supported by LZ4. This commit
addresses the issue by automatically taking the minimum of the result of this
computation and the maximum window size (64kB).
2022-09-08 13:44:21 +02:00
Marios Trivyzas
dbffe3472b
LUCENE-10423: Remove usages of System.currentTimeMillis() from tests ( #11749 )
* Remove usages of System.currentTimeMillis() from tests
- Use Random from `RandomizedRunner` to be able to use a Seed to
reproduce tests, instead of a seed coming from wall clock.
- Replace time based tests, using wall clock to determine periods
with counter of repetitions, to have a consistent reproduction.
Closes : #11459
* address comments
* tune iterations
* tune iterations for nightly
2022-09-06 17:55:01 -04:00
Dawid Weiss
d3460fa1bb
Add tidy after addVersion is called. ( #11748 )
2022-09-04 19:50:38 +02:00
Greg Miller
84cae4f27c
Simplify dense optimization check in TermInSetQuery ( #11737 )
2022-09-02 07:51:29 -07:00
Greg Miller
202dd809bd
Ensure TermInSetQuery ScoreSupplier never returns null Scorer
2022-09-01 15:31:14 -07:00
Greg Miller
680f21dca5
LUCENE-10207: TermInSetQuery now provides a ScoreSupplier with cost estimation for use in IndexOrDocValuesQuery ( #1058 )
2022-09-01 14:04:43 -07:00
Michael Sokolov
0462a0ad73
fixed index order needed for TestKnnVectorQuery.testScoreEuclidean ( #11732 )
2022-09-01 09:53:57 -04:00
Michael Sokolov
1649964f07
Forward-port CHANGES entry for quantized HNSW vectors from 9.x branch
2022-09-01 09:53:46 -04:00
Tomoko Uchida
fd86968fee
remove a link to old Jira in README.
2022-09-01 00:41:56 +09:00
Mayya Sharipova
554fabf682
LUCENE-10633 Disable sort optimization for SortedSetSortField ( #3125 )
Add ability to SortedSetSortField to disable sort optimization
2022-08-30 16:52:28 -04:00
Michael Sokolov
61ef031f7f
SimpleText knn vectors; fix searchExhaustively and suppress a byte format test case ( #11725 )
2022-08-29 11:49:52 -04:00
Tomoko Uchida
29f94b0404
a bit of clarification about GitHub Milestone
2022-08-28 13:52:58 +09:00
Tomoko Uchida
6d664ccd95
adjust wording
2022-08-27 13:02:48 +09:00
Tomoko Uchida
09a7f9aa53
clarify the relation between CHANGES and Milestone
2022-08-27 12:58:33 +09:00
Tomoko Uchida
224953304c
Document about Milestone for release planning ( #11723 )
2022-08-27 12:29:40 +09:00
Tomoko Uchida
e61958e4fd
links to github should be '/issues'
2022-08-27 11:54:20 +09:00
Dawid Weiss
4f7543725c
#11720 Upgrade randomizedtesting to 2.8.1 ( #11721 )
2022-08-26 00:01:57 +02:00
Mike Drob
dbc7a9764a
Add Integer awareness to RamUsageEstimator.sizeOf ( #11715 )
Additionally, update comments to reflect that we have not been VM cache-aware for a long time now.
2022-08-25 15:18:08 -05:00
Uwe Schindler
1d54299011
Fix classloading deadlock in analysis factories / AnalysisSPILoader initialization. This closes #11701 ( #11718 )
2022-08-25 18:16:04 +02:00
Tomoko Uchida
53b1ce7504
update contributing guide for GH issue ( #11716 )
2022-08-25 04:06:09 +09:00
Greg Miller
1529606763
Optimize TermInSetQuery for terms that match all docs in a segment ( #1062 )
2022-08-23 08:37:44 -07:00
Michael Sokolov
8021c2db4e
Don't throw an exception for byte-encoded vectors in SimpleText codec
2022-08-22 08:29:58 -04:00
Julie Tibshirani
df67223497
Disable byte encoding in TestSimpleTextKnnVectorsFormat
2022-08-21 17:00:57 -07:00
Julie Tibshirani
653d2ebf71
Remove KnnVectorsFormat#currentVersion ( #1077 )
These internal versions only make sense within a codec definition, and aren't
meant to be exposed and compared across codecs. Since this method is only used
in tests, we can move the check to the test classes instead.
2022-08-21 13:09:07 -07:00
Michael Sokolov
daa56d30f0
Fix TestHnswGraph rare failure
2022-08-20 17:26:50 -04:00
Michael Sokolov
0a58318e16
Fix for bad cast when sorting a KnnVectors index over BytesRef ( #1074 )
2022-08-20 17:23:47 -04:00
Michael Sokolov
798c02dd70
fix VectorUtil.dotProductScore normalization ( #1073 )
2022-08-20 09:15:38 -04:00
Michael Sokolov
60fa19d509
don't call BitSet.cardinality() more than needed ( #1075 )
2022-08-20 08:40:50 -04:00
Michael Sokolov
f9680c6807
Add safety checks to KnnVectorField; fixed issue with copying BytesRef ( #1076 )
2022-08-20 08:38:42 -04:00
Tomoko Uchida
9ae3498f82
add notes about labels' color code
2022-08-20 13:22:50 +09:00