LUCENE-10384 and PR#615 introduced encoding f into NeighborQueue.
However, one function, `nodes()`, was not updated to use this encoding.
Also modify the test so that it would fail without this patch.
When only the number of matching documents is needed, IndexSearcher#search(Query, Collector) is commonly used, but it does not use the executor that may have been set on the searcher. Calling `IndexSearcher#count(Query)` makes the code more concise and is also more correct, as it honours the executor set on the searcher instance.
Co-authored-by: Adrien Grand <jpountz@gmail.com>
In a previous commit, we updated HNSW merge to first write the combined segment
vectors to a file, then use that file to build the graph. This commit applies
the same strategy to flush, which lets us use the same logic for flush and
merge.
When merging segments together, the `KnnVectorsWriter` creates a `VectorValues`
instance with a merged view of all the segments' vectors. This merged instance
is used when constructing the new HNSW graph. Graph building needs random
access, and the merged VectorValues support this by mapping from merged
ordinals to segments and segment ordinals. This mapping can add significant
overhead when building the graph.
This change updates the HNSW merging logic to first write the combined segment
vectors to a file, then use that file to build the graph. This helps speed
up segment merging, and also lets us simplify `VectorValuesMerger`, which
provides the merged view of vector values.
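To illustrate where the overhead of the merged view comes from, here is a minimal sketch of a merged-ordinal to (segment, segment-ordinal) mapping. It is not Lucene's actual implementation: the class `MergedOrdinalMap` and its methods are hypothetical names, and the binary search over per-segment start offsets is an assumed strategy. The point is only that every random access pays for a lookup like this, whereas a single flat file needs none.

```java
/** Hypothetical illustration only; not the code used by Lucene's merged vector view. */
final class MergedOrdinalMap {
  // starts[i] is the first merged ordinal that belongs to segment i.
  private final int[] starts;
  // Total number of vectors across all segments.
  private final int total;

  MergedOrdinalMap(int[] segmentSizes) {
    starts = new int[segmentSizes.length];
    int sum = 0;
    for (int i = 0; i < segmentSizes.length; i++) {
      starts[i] = sum;
      sum += segmentSizes[i];
    }
    total = sum;
  }

  /** Returns the index of the segment that holds the given merged ordinal. */
  int segmentOf(int mergedOrd) {
    if (mergedOrd < 0 || mergedOrd >= total) {
      throw new IndexOutOfBoundsException("ordinal out of range: " + mergedOrd);
    }
    // Binary search for the last segment whose start offset is <= mergedOrd.
    // Picking the last one skips over empty segments, which share a start offset.
    int lo = 0;
    int hi = starts.length - 1;
    while (lo < hi) {
      int mid = (lo + hi + 1) >>> 1;
      if (starts[mid] <= mergedOrd) {
        lo = mid;
      } else {
        hi = mid - 1;
      }
    }
    return lo;
  }

  /** Returns the ordinal of the vector within its own segment. */
  int segmentOrdOf(int mergedOrd) {
    return mergedOrd - starts[segmentOf(mergedOrd)];
  }

  public static void main(String[] args) {
    // Three segments holding 3, 2 and 4 vectors.
    MergedOrdinalMap map = new MergedOrdinalMap(new int[] {3, 2, 4});
    for (int ord = 0; ord < 9; ord++) {
      System.out.println(ord + " -> segment " + map.segmentOf(ord)
          + ", ordinal " + map.segmentOrdOf(ord));
    }
  }
}
```

Graph construction performs many random vector reads, so even a cheap per-read lookup like this adds up. Once the vectors are written to one file, ordinals can be used directly.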
The sort position parameter in SortField.getComparator() is only ever used
to determine whether or not skipping should be enabled on a given comparator,
so the parameter name should reflect that. This commit also explicitly disables
skipping in a number of cases where it is never used, in particular CheckIndex
and the grouping collectors.
1. Correct the remaining size for input files larger
than Integer.MAX_VALUE, as currently with every
iteration we try to map the next blockSize of bytes
even if fewer than blockSize bytes are left in the file.
2. Correct java.lang.ClassCastException when retrieving
KnnGraphValues for stats printing.
3. Add an option for the Euclidean metric.
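The remaining-size fix in item 1 can be sketched as follows. This is a minimal illustration under assumptions, not the patched code itself: `ChunkSizes`, `nextChunkSize` and `chunkSizes` are hypothetical names, and the sketch only computes the sizes that would be mapped instead of mapping a real file. The idea is to clamp each chunk to the bytes that are actually left, using `long` arithmetic so that files larger than Integer.MAX_VALUE are handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of clamping the last mapped chunk to the bytes that remain. */
final class ChunkSizes {

  /** Number of bytes to map for the chunk that starts at the given position. */
  static int nextChunkSize(long fileSize, long position, int blockSize) {
    if (blockSize <= 0 || position < 0 || position > fileSize) {
      throw new IllegalArgumentException("invalid position or block size");
    }
    // Computed as a long: the remainder may exceed Integer.MAX_VALUE.
    long remaining = fileSize - position;
    // The result never exceeds blockSize, so the cast to int is safe.
    return (int) Math.min((long) blockSize, remaining);
  }

  /** Sizes of all chunks needed to cover a file of the given size. */
  static List<Integer> chunkSizes(long fileSize, int blockSize) {
    List<Integer> sizes = new ArrayList<>();
    long position = 0;
    while (position < fileSize) {
      int size = nextChunkSize(fileSize, position, blockSize);
      sizes.add(size);
      position += size;
    }
    return sizes;
  }

  public static void main(String[] args) {
    // A 5 GB file mapped in blocks of Integer.MAX_VALUE bytes:
    // two full blocks followed by a shorter final block.
    System.out.println(chunkSizes(5_000_000_000L, Integer.MAX_VALUE));
  }
}
```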
This test was occasionally failing on CI, as the test randomly installed a merge policy
that would force compound file creation while the goal of the test was not to do so.