Michael McCandless
88588e3dea
LUCENE-10052: cutover more tests to newBytesRef, and finally catches a fly (FSTTermsReader.IntersectEnum was illegally ignoring BytesRef.offset, yay!) ( #258 )
2021-08-25 12:18:23 -04:00
Adrien Grand
8917fbe039
LUCENE-9613, LUCENE-10067: Add more specialization for the ordinals case.
2021-08-25 14:34:04 +02:00
Dawid Weiss
45868a52f1
LUCENE-9990: upgrade to gradle 7.2.
2021-08-25 10:04:42 +02:00
Dawid Weiss
0d07104de0
Piggyback spotless upgrade to 5.14.3
2021-08-25 10:03:59 +02:00
Dawid Weiss
a8d4f658de
Upgrade to gradle 7.2
2021-08-25 10:03:59 +02:00
Dawid Weiss
0cbafa4879
Fix gradle error hints.
2021-08-25 10:03:59 +02:00
Dawid Weiss
fdccdee734
Move logging to info level.
2021-08-25 10:03:59 +02:00
Dawid Weiss
26eb84a3b5
Fix immutable properties. Fix ant uri namespace no longer working (seems like gradle regression).
2021-08-25 10:03:59 +02:00
Dawid Weiss
2b0378cd4a
Use JavaInfo instead of toolchains. Internal but works and is free of toolchain's quirks.
2021-08-25 10:03:59 +02:00
Dawid Weiss
68cf86ba35
Experiments with the new apis.
2021-08-25 10:03:59 +02:00
Dawid Weiss
72f373791e
Upgrade palantir's plugin.
2021-08-25 10:03:59 +02:00
Dawid Weiss
3ff4263535
Upgrade gradle to 7.1.1
2021-08-25 10:03:59 +02:00
Dawid Weiss
523cea2c5d
Revert "Adding initial patch by Gautam Worah" (restore pristine main)
...
This reverts commit 067ab4f503aabea59639e692e3ea9ee30750c68e.
2021-08-25 10:03:59 +02:00
Dawid Weiss
bac22d6116
Adding initial patch by Gautam Worah
2021-08-25 10:03:59 +02:00
Mayya Sharipova
fc67d6aa6e
Revert "LUCENE-10054 Make HnswGraph hierarchical ( #250 )"
...
This reverts commit 257d256def.
We've decided to have a separate feature branch for HNSW,
and put all related changes there.
2021-08-24 14:58:59 -04:00
Julie Tibshirani
782c3cca3a
LUCENE-10040: Relax TestKnnVectorQuery#testDeletes assertion ( #251 )
...
TestKnnVectorQuery#testDeletes assumes that if there are n total documents, we
can perform a kNN search with k=n and retrieve all documents. This isn't true
with our implementation -- due to randomization we may select fewer than n entry
points and never visit some vectors.
2021-08-24 11:15:27 -07:00
Adrien Grand
83ba5d859c
LUCENE-7020: Remove TieredMergePolicy#setMaxMergeAtOnceExplicit. ( #230 )
...
TieredMergePolicy no longer bounds the number of segments that can be merged via
a forced merge.
2021-08-24 10:27:00 +02:00
Mayya Sharipova
257d256def
LUCENE-10054 Make HnswGraph hierarchical ( #250 )
...
Currently HNSW has only a single layer.
This is the first part to make it multi-layered.
To keep changes small, this PR only adds
multiple layers in the HnswGraph class.
TODO for following PRs:
- modify graph construction and search algorithm for a hierarchical
graph.
- modify Lucene90HnswVectorsWriter and Lucene90HnswVectorsReader to
write and read multiple layers
2021-08-23 15:54:26 -04:00
Greg Miller
46fa09d265
LUCENE-5309: Optimize facet counting for single-valued SSDV / StringValueFacetCounts ( #255 )
2021-08-23 10:01:23 -07:00
51search
191ee3ad3e
LUCENE-10058: fix gradle lucene:benchmark:run error ( #253 )
2021-08-23 10:36:33 -04:00
Uwe Schindler
5813292de2
LUCENE-10055: Update Subversion folder for Javadocs
2021-08-22 13:34:09 +02:00
Michael Sokolov
054b444c14
Fix off-by-one in TestDemo.testKnnVectorSearch
2021-08-21 14:22:47 -04:00
Mike Drob
c36495dce7
LUCENE-10017 Less verbose exception on IndexFormatTooOld ( #200 )
2021-08-20 15:40:52 -05:00
Dzung Bui
0c3c8ec09a
LUCENE-10059: Fix an AssertionError when JapaneseTokenizer tries to backtrace from and to the same position ( #254 )
...
Co-authored-by: Anh Dung Bui <buidun@amazon.com>
2021-08-20 08:21:58 -04:00
Michael Sokolov
5896e5389a
LUCENE-10057: Use Lucene abstractions to store demo KnnVectorDict (Dawid Weiss)
2021-08-19 16:14:06 -04:00
Michael Sokolov
eeb296ce90
LUCENE-8638: remove LegacyBM25Similarity
2021-08-18 15:44:56 -04:00
Michael Sokolov
b8210dee7a
Close vector dictionary when exiting the demo
2021-08-18 15:43:33 -04:00
Michael Sokolov
d1d60e2db6
LUCENE-8638: remove unused deprecated methods and related tests ( #248 )
2021-08-18 08:19:49 -04:00
Michael Sokolov
666c7a2590
LUCENE-8638: remove deprecated FST get by output
2021-08-18 08:15:31 -04:00
Michael Sokolov
a37844aedd
LUCENE-10016: Added KnnVector index/query support to demo
2021-08-18 08:13:59 -04:00
Michael Sokolov
4213f9d3cd
LUCENE-8638: remove long-deprecated Jaspell suggester
2021-08-17 17:45:22 -04:00
Michael McCandless
65a53450dc
LUCENE-10052: first cut at LTC.newBytesRef methods, and switching a few test cases over ( #245 )
...
* LUCENE-10052: first cut at LTC.newBytesRef methods, to randomize the offset/length of a BytesRef, and switching a few test cases over
2021-08-17 16:18:40 -04:00
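The idea behind LUCENE-10052 (randomizing the offset and length of a byte slice so that code which silently assumes `offset == 0` gets caught) can be illustrated with a minimal sketch. The `Slice` class and the `randomizedSlice` helper below are hypothetical stand-ins written for this illustration; they are not Lucene's `BytesRef` or the actual `newBytesRef` test helper.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

public class BytesRefOffsetSketch {
  /** Hypothetical stand-in for a (bytes, offset, length) slice; not Lucene's BytesRef. */
  static final class Slice {
    final byte[] bytes;
    final int offset;
    final int length;

    Slice(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Hypothetical helper: embeds content in a larger junk-filled array at a random offset. */
  static Slice randomizedSlice(byte[] content, Random random) {
    int lead = random.nextInt(8); // junk bytes before the content
    int tail = random.nextInt(8); // junk bytes after the content
    byte[] padded = new byte[lead + content.length + tail];
    random.nextBytes(padded);
    System.arraycopy(content, 0, padded, lead, content.length);
    return new Slice(padded, lead, content.length);
  }

  /** Correct consumer: honors both offset and length. */
  static byte[] copyValid(Slice s) {
    return Arrays.copyOfRange(s.bytes, s.offset, s.offset + s.length);
  }

  /** Buggy consumer: assumes the slice starts at index 0. */
  static byte[] copyIgnoringOffset(Slice s) {
    return Arrays.copyOfRange(s.bytes, 0, s.length);
  }

  public static void main(String[] args) {
    byte[] term = "lucene".getBytes(StandardCharsets.UTF_8);
    Slice s = randomizedSlice(term, new Random());
    // The correct consumer always round-trips the content.
    System.out.println(Arrays.equals(term, copyValid(s)));
    // The buggy consumer only round-trips when the random offset happens to be 0.
    System.out.println(Arrays.equals(term, copyIgnoringOffset(s)));
  }
}
```

A test that always builds slices with offset 0 would never distinguish the two consumers, which is why randomizing the offset is useful.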
Michael Sokolov
2d21a600ba
LUCENE-8638: remove deprecated code ( #243 )
2021-08-17 13:51:04 -04:00
Julie Tibshirani
29ed3908ea
LUCENE-9614: Small fixes to KnnVectorQuery hashCode and toString
2021-08-16 09:10:53 -07:00
Julie Tibshirani
e48be684b2
LUCENE-9614: Prevent TestKnnVectorQuery from using simple text codec ( #244 )
...
The simple text codec doesn't support kNN searches, so the test will fail when
we randomly choose to use it.
2021-08-16 09:11:03 -07:00
Julie Tibshirani
6993fb9a99
LUCENE-10040: Handle deletions in nearest vector search ( #239 )
...
This PR extends VectorReader#search to take a parameter specifying the live
docs. LeafReader#searchNearestVectors then always returns the k nearest
undeleted docs.
To implement this, the HNSW algorithm will only add a candidate to the result
set if it is a live doc. The graph search still visits and traverses deleted
docs as it gathers candidates.
2021-08-16 07:44:17 -07:00
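The behavior described in LUCENE-10040 above (deleted docs are still traversed, but only live docs enter the result set) can be sketched as follows. This is a simplified single-layer best-first search over a toy graph with one-dimensional "vectors", written only for illustration; the class name, method signature, and data layout are assumptions, not Lucene's HNSW implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class LiveDocsKnnSketch {
  /**
   * Returns up to k live node ids ordered by increasing distance to the query.
   * values[i] is the 1-D vector of node i, neighbors[i] its adjacency list,
   * live[i] is false for deleted nodes. All names here are hypothetical.
   */
  static List<Integer> search(
      float[] values, int[][] neighbors, float query, int k, int entry, boolean[] live) {
    Comparator<Integer> byDistance =
        Comparator.comparingDouble(n -> Math.abs(values[n] - query));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance); // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed()); // farthest first
    boolean[] visited = new boolean[values.length];

    visited[entry] = true;
    candidates.add(entry);
    if (live[entry]) {
      results.add(entry);
    }

    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop once the closest remaining candidate is worse than the worst kept result.
      if (results.size() == k && byDistance.compare(current, results.peek()) > 0) {
        break;
      }
      for (int next : neighbors[current]) {
        if (visited[next]) {
          continue;
        }
        visited[next] = true;
        candidates.add(next); // deleted nodes are still traversed
        if (live[next]) { // but only live nodes may become results
          results.add(next);
          if (results.size() > k) {
            results.poll(); // drop the farthest result
          }
        }
      }
    }

    List<Integer> ordered = new ArrayList<>(results);
    ordered.sort(byDistance);
    return ordered;
  }
}
```

In a chain graph where the node nearest to the query is deleted, the search still has to pass through that node to reach the live nodes behind it, which is why filtering happens at result collection rather than during traversal.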
Mike McCandless
19e5c00a4f
LUCENE-10014: fix performance bug: when writing doc values with block GCD compression we were unnecessarily wasting index storage by failing to take full advantage of the GCD compression
2021-08-16 08:40:02 -04:00
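For context on LUCENE-10014 above, the general idea of GCD compression is that when all values in a block share a common divisor after subtracting the minimum, the quotients can be stored in fewer bits than the raw deltas. The sketch below only illustrates that arithmetic; the class and method names are hypothetical, overflow is ignored, and it does not reproduce the actual doc values writer.

```java
public class GcdCompressionSketch {
  /** Euclid's algorithm on absolute values; gcd(0, x) == |x|. */
  static long gcd(long a, long b) {
    a = Math.abs(a);
    b = Math.abs(b);
    while (b != 0) {
      long t = a % b;
      a = b;
      b = t;
    }
    return a;
  }

  /** Number of bits needed to represent a non-negative value (at least 1). */
  static int bitsRequired(long maxValue) {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
  }

  /** Bits per value for one block, with or without dividing deltas by their GCD. */
  static int bitsPerValue(long[] values, boolean useGcd) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (long v : values) {
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    long divisor = 0;
    for (long v : values) {
      divisor = gcd(divisor, v - min); // GCD of all deltas from the minimum
    }
    long range = max - min;
    if (useGcd && divisor > 1) {
      range /= divisor; // store (v - min) / gcd instead of (v - min)
    }
    return bitsRequired(range);
  }

  public static void main(String[] args) {
    long[] block = {1_000, 4_000, 10_000, 7_000};
    System.out.println(bitsPerValue(block, false)); // 14
    System.out.println(bitsPerValue(block, true)); // 2
  }
}
```

Here the deltas from the minimum are 0, 3000, 9000 and 6000, so the GCD is 3000 and the largest stored quotient is 3.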
Mike McCandless
b18f714096
LUCENE-10008: add CHANGES entry
2021-08-13 14:47:53 -04:00
Vigya Sharma
cb4c8ae07f
LUCENE-10008: Respect ignoreCase flag in CommonGramsFilterFactory and factor out a common abstract base class AbstractWordsFileFilterFactory.java ( #188 )
2021-08-13 14:45:58 -04:00
Michael Sokolov
624560a3d7
LUCENE-9614: add KnnVectorQuery implementation
2021-08-13 12:15:40 -04:00
Julie Tibshirani
a9fb5a965d
LUCENE-10043: Decrease default LRUQueryCache#skipCacheFactor to 10 ( #232 )
...
In LUCENE-9002 we introduced logic to skip caching a clause if it would be too
expensive compared to the usual query cost. Specifically, we avoid caching a
clause if its cost is estimated to be 250x higher than the lead iterator's.
We've found that the default of 250 is quite high and can lead to poor tail
latencies. This PR decreases it to 10 to cache more conservatively.
2021-08-11 13:29:12 +03:00
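The skip rule described in LUCENE-10043 above can be sketched as a single comparison. The class, method, and parameter names below are hypothetical and chosen for illustration; this is not the actual `LRUQueryCache` code.

```java
public class SkipCacheSketch {
  /**
   * Returns true when a clause looks too expensive to cache: its estimated cost
   * exceeds the lead iterator's cost by more than the given factor.
   */
  static boolean shouldSkipCaching(long clauseCost, long leadCost, float skipCacheFactor) {
    return clauseCost / skipCacheFactor > leadCost;
  }

  public static void main(String[] args) {
    long clauseCost = 5_000;
    long leadCost = 100;
    // With the old default of 250, a clause 50x costlier than the lead is still cached.
    System.out.println(shouldSkipCaching(clauseCost, leadCost, 250f)); // false
    // With the new default of 10, the same clause is skipped.
    System.out.println(shouldSkipCaching(clauseCost, leadCost, 10f)); // true
  }
}
```

Lowering the factor means fewer expensive clauses are cached eagerly, which matches the stated goal of caching more conservatively.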
Mike McCandless
931ff63232
LUCENE-9963: add CHANGES entry
2021-08-09 16:11:31 -04:00
Geoffrey Lawson
647255b4d2
LUCENE-9963 Improve FlattenGraphFilter's robustness when handling incoming token graphs with holes ( #157 )
...
6 main improvements:
1) Iterate through all output.InputNodes since dest gaps can exist.
2) freeBefore the minimum input node instead of the first input node (which was usually, but not always, the minimum).
3) Don't freeBefore from a hole source node. Bookkeeping may not be correct and could result in an early free.
4) When adding an output node after hole recovery, calculate its new position increment instead of adding it to the end of the output graph.
5) Nodes after holes that have edges to their source will do the output re-mapping that the deleted node would have done.
6) If a disconnected input node swaps order with another node in the output, then map them to the same output node.
Co-authored-by: Lawson <geoffrl@amazon.com>
2021-08-09 16:06:53 -04:00
Greg Miller
a11457b4e6
LUCENE-10047: Fix value de-duping check in LongValueFacetCounts and RangeFacetCounts ( #237 )
2021-08-07 10:20:49 -07:00
Greg Miller
e937e739f3
LUCENE-10046: Fix counting bug in StringValueFacetCounts ( #236 )
2021-08-07 07:32:50 -07:00
Greg Miller
3037e33025
Slight improvement/optimization to duplicate facet value checking (ref: LUCENE-9964) ( #234 )
2021-08-06 12:57:09 -07:00
Greg Miller
645b64ef4e
Update CHANGES entry for LUCENE-9945 after backporting
2021-08-02 16:38:10 -07:00
Sejal Pawar
a76f2f8072
LUCENE-9945: Extend DrillSidewaysResult to expose drillDowns and drillSideways ( #159 )
2021-08-02 16:01:08 -07:00
Greg Miller
7450a7e64b
Update CHANGES entry for LUCENE-10030 after backporting
2021-08-01 12:39:11 -07:00
Dawid Weiss
b016c8dc2a
LUCENE-10042: JAR minimal manifest JDK entries are incorrectly set to build-JVM
2021-08-01 14:14:42 +02:00