Commit Graph

36608 Commits

Author SHA1 Message Date
Alessandro Benedetti af1afc8cb6 * GITHUB#12252 CHANGES.txt fix 2023-06-14 16:00:41 +01:00
Elia Porciani 14c18d8624
GITHUB-12252: Add function queries for computing similarity scores between knn vectors (#12253)
Co-authored-by: Alessandro Benedetti <a.benedetti@sease.io>
2023-06-14 15:49:00 +01:00
Adrien Grand a8baa47733
Move TermAndBoost back to its original location. (#12366)
PR #12169 accidentally moved the `TermAndBoost` class to a different location,
which would break custom sub-classes of `QueryBuilder`. This commit moves it
back to its original location.
2023-06-14 11:54:10 +02:00
Chaitanya Gohel 65447c8388
Add CHANGES.txt for #12334 Honor after value for skipping documents even if queue is not full for PagingFieldCollector (#12368)
Signed-off-by: gashutos <gashutos@amazon.com>
2023-06-14 10:17:06 +02:00
Chris Hegarty 1090928c14
Implement VectorUtilProvider with Java 21 Project Panama Vector API (#12363)
This commit enables the Panama Vector API for Java 21. The version of
VectorUtilPanamaProvider for Java 21 is identical to that of Java 20.
As such, there is no specific 21 version - the Java 20 version will be
loaded from the MRJAR.
2023-06-13 09:44:58 +01:00
Jonathan Ellis 071461ece5
Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors (#12281)

Co-authored-by: Uwe Schindler <uschindler@apache.org>
2023-06-13 10:40:03 +02:00
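
The checks named in the commit title above (non-null, non-empty, finite) can be sketched as below. This is only an illustration under stated assumptions: the class `VectorChecksSketch` and the method `checkVector` are hypothetical names, and the exception type and messages are not taken from the Lucene source.

```java
public final class VectorChecksSketch {
  private VectorChecksSketch() {}

  /**
   * Hypothetical validation helper: rejects null, empty, and non-finite
   * vectors, and returns the vector unchanged when it passes.
   */
  public static float[] checkVector(float[] vector) {
    if (vector == null) {
      throw new IllegalArgumentException("vector must not be null");
    }
    if (vector.length == 0) {
      throw new IllegalArgumentException("vector must not be empty");
    }
    for (int i = 0; i < vector.length; i++) {
      // Float.isFinite is false for NaN and for both infinities.
      if (!Float.isFinite(vector[i])) {
        throw new IllegalArgumentException(
            "non-finite value at vector[" + i + "]: " + vector[i]);
      }
    }
    return vector;
  }
}
```

Validating once at field or query construction time means later similarity computations do not need to guard against NaN or infinite components.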
gf2121 30eba6df56
Speed up IndexedDISI Sparse #AdvanceExactWithinBlock for tiny step advance (#12324) 2023-06-13 14:24:26 +08:00
Uwe Schindler c8e05c8cd6
Implement MMapDirectory with Java 21 Project Panama Preview API (#12294) 2023-06-12 21:07:04 +02:00
Chris Fournier 41baf23ad9
Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth (#12249) 2023-06-12 18:20:10 +02:00
Uwe Schindler ef35e6edf4
Work around SecurityManager issues during initialization of vector api (JDK-8309727) (#12362) 2023-06-09 22:07:31 +02:00
Alan Woodward a51241e4c9
Better paging when random reads go backwards (#12357)
When reading data from outside the buffer, BufferedIndexInput always resets
its buffer to start at the new read position. If we are reading backwards (for example,
using an OffHeapFSTStore for a terms dictionary) then this can have the effect of
re-reading the same data over and over again.

This commit changes BufferedIndexInput to use paging when reading backwards,
so that if we ask for a byte that is before the current buffer, we read a block of data
of bufferSize that ends at the previous buffer start.

Fixes #12356
2023-06-09 11:57:59 +01:00
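
The backward-paging idea in the commit message above can be sketched as a small offset calculation. This is a minimal illustration, not the actual `BufferedIndexInput` code: the class `PagingSketch`, the method `bufferStartFor`, and the fallback for reads more than one block behind the buffer are all assumptions.

```java
public final class PagingSketch {
  private PagingSketch() {}

  /**
   * Hypothetical helper: returns the file offset at which the buffer should
   * start so that it contains {@code pos}.
   */
  public static long bufferStartFor(
      long pos, long currentBufferStart, int currentBufferLength, int bufferSize) {
    long currentBufferEnd = currentBufferStart + currentBufferLength;
    if (pos >= currentBufferStart && pos < currentBufferEnd) {
      // The byte is already buffered; nothing to reload.
      return currentBufferStart;
    }
    if (pos < currentBufferStart) {
      // Reading backwards: load a block of bufferSize that ends at the
      // previous buffer start, so further backward reads hit the buffer.
      long pagedStart = Math.max(0L, currentBufferStart - bufferSize);
      // Assumed fallback: if pos lies before that block, start at pos.
      return pos >= pagedStart ? pagedStart : pos;
    }
    // Reading forwards past the buffer: start the new buffer at pos.
    return pos;
  }
}
```

With a 1024-byte buffer starting at offset 4096, a read at offset 4090 would move the buffer to offset 3072 rather than 4090, so the next several backward reads are served without touching the file again.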
fudongying 2934899ca6
feat: soft delete optimize (#12339) 2023-06-09 11:41:28 +02:00
Ignacio Vera 9a2d19324f
[Tessellator] Improve the checks that validate the diagonal between two polygon nodes (#12353) 2023-06-09 08:10:33 +02:00
Peter Gromov 5b63a1879d
TestHunspell: reduce the flakiness probability (#12351)
We need to check how the timeout interacts with custom exception-throwing checkCanceled.
The default timeout seems not enough for some CI agents, so let's increase it.

Co-authored-by: Dawid Weiss <dawid.weiss@gmail.com>
2023-06-07 14:10:44 +02:00
Patrick Zhai 0c293909c0
Add updateDocuments API which accept a query (reopen) (#12346) 2023-06-03 20:16:16 -07:00
Greg Miller 52ace7eb35
Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit (#12320) 2023-06-02 09:34:52 -07:00
Petr Portnov | PROgrm_JARvis 45110a6a46
Make memory fence in `ByteBufferGuard` explicit (#12290) 2023-06-01 13:41:06 +02:00
Uwe Schindler 40b582ab18
Revert "Add updateDocuments API which accept a query (#12341)" (#12344)
This reverts commit 52ab16731e.
2023-06-01 13:37:36 +02:00
Patrick Zhai 52ab16731e
Add updateDocuments API which accept a query (#12341) 2023-06-01 13:37:04 +02:00
Peter Gromov 4bf1b94209
hunspell (minor): reduce allocations when reading the dictionary's morphological data (#12323)
There can be many entries with morph data, so it is better to avoid compiling and matching regexes, and even allocating streams.
2023-06-01 11:37:38 +02:00
tang donghai ac8c1870fa
NeighborQueue: set incomplete to false when clear is called (#12322) 2023-05-31 19:55:21 -07:00
Greg Miller f79b316bd5 Add CHANGES entry for GH#12334 2023-05-31 15:18:34 -07:00
Chaitanya Gohel d44be24025
Fix searchafter high latency when after value is out of range for segment (#12334) 2023-05-31 15:07:53 -07:00
Daniele Antuzi da36c24cb9
Use thread-safe search version of HnswGraphSearcher (#12246)
Addressing comment received in the PR https://github.com/apache/lucene/pull/12246
2023-05-30 15:38:06 +01:00
Luca Cavanna 72b91156f3
Don't generate stacktrace for TimeExceededException (#12335)
The exception is package private and never rethrown, we can avoid
generating a stacktrace for it.
2023-05-30 10:29:46 +02:00
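
One common way to avoid generating a stack trace for an exception that is used only as an internal signal is the protected `RuntimeException` constructor with `writableStackTrace` set to `false`. The sketch below shows that technique in isolation. The class names are hypothetical, and the commit message does not say whether Lucene uses this constructor or overrides `fillInStackTrace()` instead.

```java
public final class StacklessExceptionSketch {
  private StacklessExceptionSketch() {}

  /** Hypothetical exception used purely as an internal control-flow signal. */
  static final class TimeoutSignal extends RuntimeException {
    TimeoutSignal() {
      // Arguments: message, cause, enableSuppression, writableStackTrace.
      // Passing false for writableStackTrace skips the stack walk.
      super("time exceeded", null, false, false);
    }
  }

  /** Throws and catches the signal, returning the recorded stack depth. */
  public static int stackDepth() {
    try {
      throw new TimeoutSignal();
    } catch (TimeoutSignal e) {
      return e.getStackTrace().length;
    }
  }
}
```

Because the exception is caught inside the same package and never reaches user code, the missing stack trace costs nothing in debuggability.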
Patrick Zhai d1850e44f3
Update TestVectorUtilProviders.java (#12338) 2023-05-29 16:26:29 -07:00
Uwe Schindler db0c21f25d Cleanup and update changes and synchronize with 9.x 2023-05-26 18:22:51 +02:00
Jonathan Ellis 431dc7b415
add BitSet.clear() (#12268) 2023-05-26 18:13:16 +02:00
Greg Miller 367b03bfc2
GH#12321: Reduce visibility of StringsToAutomaton (#12331) 2023-05-26 08:55:02 -07:00
Uwe Schindler f5f25777d8 Update changes to be correct with ARM (it is called NEON there) 2023-05-26 16:53:39 +02:00
Luca Cavanna 24712d7525 Move changes entry for #12328 to 9.7 2023-05-26 15:11:21 +02:00
Armin Braun fd75807350
Optimize ConjunctionDISI.createConjunction (#12328)
This method is showing up as a little hot when profiling some queries.
Almost all the time spent in this method is just burnt on ceremony
around stream indirections that don't inline.
Moving this to iterators, simplifying the check for same doc id and also saving one iteration (for the min
cost) makes this method far cheaper and easier to read.
2023-05-26 13:44:39 +02:00
Luca Cavanna 0ce6b9a67b Adjust changes entries for knn query concurrent rewrite
Moved entry for #12160 to 9.7.0 as it's been backported.
Added missing entry for #12325.
2023-05-26 09:25:54 +02:00
Luca Cavanna 10bebde269
Parallelize knn query rewrite across slices rather than segments (#12325)
The concurrent query rewrite for knn vector query introduced with #12160
requests one thread per segment to the executor. To align this with the
IndexSearcher parallel behaviour, we should rather parallelize across
slices. Also, we can reuse the same slice executor instance that the
index searcher already holds, in that way we are using a
QueueSizeBasedExecutor when a thread pool executor is provided.
2023-05-26 09:17:25 +02:00
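
The difference between one task per segment and one task per slice can be sketched as follows. This is a simplified model, not the `IndexSearcher` implementation: the class `SliceParallelismSketch`, the grouping rule based on `maxDocsPerSlice`, and the fixed two-thread pool are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class SliceParallelismSketch {
  private SliceParallelismSketch() {}

  /** Groups consecutive segment ordinals into slices (hypothetical rule). */
  public static List<List<Integer>> slices(int[] segmentDocCounts, int maxDocsPerSlice) {
    List<List<Integer>> result = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    int docsInCurrent = 0;
    for (int seg = 0; seg < segmentDocCounts.length; seg++) {
      // Close the current slice when adding this segment would overflow it.
      if (!current.isEmpty() && docsInCurrent + segmentDocCounts[seg] > maxDocsPerSlice) {
        result.add(current);
        current = new ArrayList<>();
        docsInCurrent = 0;
      }
      current.add(seg);
      docsInCurrent += segmentDocCounts[seg];
    }
    if (!current.isEmpty()) {
      result.add(current);
    }
    return result;
  }

  /** Submits one task per slice and returns the number of segments visited. */
  public static int runPerSlice(int[] segmentDocCounts, int maxDocsPerSlice) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (List<Integer> slice : slices(segmentDocCounts, maxDocsPerSlice)) {
        // Stand-in for the per-slice work: each task handles all of its
        // segments sequentially on a single thread.
        Callable<Integer> task = slice::size;
        futures.add(executor.submit(task));
      }
      int visited = 0;
      for (Future<Integer> future : futures) {
        visited += future.get();
      }
      return visited;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      executor.shutdown();
    }
  }
}
```

In this model, five segments with document counts 10, 10, 50, 5, and 5 and a limit of 25 documents per slice produce three tasks instead of five, which is the kind of reduction in executor pressure the commit message describes.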
Uwe Schindler c188d47a8b
Handle jdk.internal classes mentioned in vector superclass or interfaces during extraction (#12329) 2023-05-25 17:21:03 +02:00
Michael McCandless 7da7c43638
#12276: rename DaciukMihovAutomatonBuilder to StringsToAutomaton (#12310)
Closes #12276
2023-05-25 10:18:41 -04:00
Uwe Schindler cf7245e38e Refactor loop to not addAll set to itself on initial round (followup of #12311) 2023-05-25 09:06:45 +02:00
Chris Hegarty f756f90644
Integrate the Incubating Panama Vector API (#12311)
Leverage accelerated vector hardware instructions in Vector Search.

Lucene already has a mechanism that enables the use of non-final JDK APIs, currently used for the Previewing Panama Foreign API. This change expands this mechanism to include the Incubating Panama Vector API. When the jdk.incubator.vector module is present at run time the Panamaized version of the low-level primitives used by Vector Search is enabled. If not present, the default scalar version of these low-level primitives is used (as it was previously).

Currently, we're only targeting support for JDK 20. A subsequent PR should evaluate JDK 21.

Co-authored-by: Uwe Schindler <uschindler@apache.org>
Co-authored-by: Robert Muir <rmuir@apache.org>
2023-05-25 07:59:50 +01:00
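
The run-time selection described in the commit message above (use the vectorized primitives when `jdk.incubator.vector` is present, otherwise fall back to scalar code) can be sketched as below. This is an assumption-laden illustration: the class `VectorProviderSketch` and its methods are hypothetical, the module check via `ModuleLayer.boot()` is one possible way to detect the module and not necessarily how Lucene does it, and only the scalar path is implemented here.

```java
public final class VectorProviderSketch {
  private VectorProviderSketch() {}

  /** Returns true when the incubating vector module is in the boot layer. */
  public static boolean vectorModulePresent() {
    return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
  }

  /** Names the implementation this sketch would pick (hypothetical labels). */
  public static String describeProvider() {
    return vectorModulePresent() ? "panama" : "scalar";
  }

  /** Default scalar dot product, the fallback low-level primitive. */
  public static float scalarDotProduct(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}
```

On a JVM started without `--add-modules jdk.incubator.vector`, `describeProvider()` would be expected to report the scalar path, matching the fallback behavior the commit message describes.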
Andrey Bozhko c9c49bc553
[MINOR] Update javadoc in Query class (#12233)
- add a few missing full stops
- update wording in the description of Query#equals method
2023-05-23 12:16:50 +02:00
Patrick Zhai 8a602b5063
Add multi-thread searchability to OnHeapHnswGraph (#12257) 2023-05-21 21:48:46 -07:00
Peter Gromov a454388b80
hunspell (minor): reduce allocations when processing compound rules (#12316) 2023-05-19 21:36:05 +02:00
Uwe Schindler 84e2e3afc3
Make sure APIJAR reproduces with different timezone (unfortunately java encodes the date using local timezone) (#12315) 2023-05-19 18:42:55 +02:00
Uwe Schindler a8a95e64ce Forward port references to AccessController in VirtualMethod (#12308) 2023-05-19 16:38:24 +02:00
Jerry Chin 04ef6de826
GITHUB-12291: Skip blank lines from stopwords list. (#12299) 2023-05-18 16:58:32 +02:00
Michael Sokolov 6b51cce0b8 NeighborQueue.reset() now clears incomplete flag 2023-05-18 10:23:22 -04:00
Greg Miller 3e4ca4042c
Minor cleanup and improvements to DaciukMihovAutomatonBuilder (#12305) 2023-05-18 07:01:19 -07:00
Michael Sokolov 2facb3ae0e Revert "allocate one NeighborQueue per search for results (#12255)"
This reverts commit 9a7efe92c0.
2023-05-18 13:42:17 +00:00
Petr Portnov | PROgrm_JARvis 0c6e8aec67
Seal `IndexReader` and `IndexReaderContext` (#12296) 2023-05-17 08:47:47 +02:00
tang donghai f53eb28af0
Move max recursion from Operations.java to AutomatonTestUtil.java (#12298)
Co-authored-by: tangdonghai <tangdonghai@meituan.com>
2023-05-16 07:09:28 -04:00
Patrick Zhai 8af305892d
Optimize HNSW diversity calculation (#12235) 2023-05-15 23:20:31 -07:00