We have recently increased the likelihood of leveraging inter-segment search
concurrency in tests when `newSearcher` is used to create the index
searcher (see #959). When parallel execution is enabled though, it is
dependent on the number of documents and segments. That means
that out of 1000 test runs that use `RandomIndexWriter` to index a random
number of docs up to 100, we will effectively parallelize only a couple
of times.
This commit increases the likelihood of running concurrent searches by
randomly forcing 1 max segments per slice as well as 1 max doc per slice.
PR #12169 accidentally moved the `TermAndBoost` class to a different location,
which would break custom sub-classes of `QueryBuilder`. This commit moves it
back to its original location.
This commit enables the Panama Vector API for Java 21. The version of
VectorUtilPanamaProvider for Java 21 is identical to that of Java 20.
As such, there is no specific 21 version - the Java 20 version will be
loaded from the MRJAR.
When reading data from outside the buffer, BufferedIndexInput always resets
its buffer to start at the new read position. If we are reading backwards (for example,
using an OffHeapFSTStore for a terms dictionary) then this can have the effect of
re-reading the same data over and over again.
This commit changes BufferedIndexInput to use paging when reading backwards,
so that if we ask for a byte that is before the current buffer, we read a block of data
of bufferSize that ends at the previous buffer start.
Fixes#12356
* TestHunspell: reduce the flakiness probability
We need to check how the timeout interacts with custom exception-throwing checkCanceled.
The default timeout seems not enough for some CI agents, so let's increase it.
Co-authored-by: Dawid Weiss <dawid.weiss@gmail.com>
This method is showing up as a little hot when profiling some queries.
Almost all the time spent in this method is just burnt on ceremony
around stream indirections that don't inline.
Moving this to iterators, simplifying the check for same doc id and also saving one iteration (for the min
cost) makes this method far cheaper and easier to read.
The concurrent query rewrite for knn vectory query introduced with #12160
requests one thread per segment to the executor. To align this with the
IndexSearcher parallel behaviour, we should rather parallelize across
slices. Also, we can reuse the same slice executor instance that the
index searcher already holds, in that way we are using a
QueueSizeBasedExecutor when a thread pool executor is provided.
Leverage accelerated vector hardware instructions in Vector Search.
Lucene already has a mechanism that enables the use of non-final JDK APIs, currently used for the Previewing Pamana Foreign API. This change expands this mechanism to include the Incubating Pamana Vector API. When the jdk.incubator.vector module is present at run time the Panamaized version of the low-level primitives used by Vector Search is enabled. If not present, the default scalar version of these low-level primitives is used (as it was previously).
Currently, we're only targeting support for JDK 20. A subsequent PR should evaluate JDK 21.
---------
Co-authored-by: Uwe Schindler <uschindler@apache.org>
Co-authored-by: Robert Muir <rmuir@apache.org>