mirror of https://github.com/apache/lucene.git
Fix Lucene94HnswVectorsFormat validation on large segments (#11861)
When reading large segments, the vectors format can fail with a validation error:

java.lang.IllegalStateException: Vector data length 3070061568 not matching size=999369 * dim=768 * byteSize=4 = -1224905728

The problem is that we use an integer to represent the size, which is too small to hold it. The bug snuck in during the work to enable int8 values, which switched a long value to an int.
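As a rough illustration of the overflow described above (not code from this commit), the sketch below reproduces the two numbers in the error message. The class name `VectorSizeOverflow` and both method names are hypothetical. Only the arithmetic and the use of `Math.multiplyExact` follow the change itself.

```java
final class VectorSizeOverflow {

  // Old-style computation: every operand is an int, so the product is
  // evaluated in 32-bit arithmetic and silently wraps around.
  static int numBytesAsInt(int size, int dimension, int byteSize) {
    return size * dimension * byteSize;
  }

  // Fixed-style computation: widen to long before multiplying.
  // Math.multiplyExact throws ArithmeticException instead of wrapping
  // if even the long result were to overflow.
  static long numBytesAsLong(int size, int dimension, int byteSize) {
    long vectorBytes = Math.multiplyExact((long) dimension, byteSize);
    return Math.multiplyExact(vectorBytes, size);
  }

  public static void main(String[] args) {
    // Values taken from the exception message in the commit description.
    int size = 999369;
    int dimension = 768;
    int byteSize = Float.BYTES;

    System.out.println(numBytesAsInt(size, dimension, byteSize));  // -1224905728
    System.out.println(numBytesAsLong(size, dimension, byteSize)); // 3070061568
  }
}
```

The wrapped int value is exactly 3070061568 - 2^32, which is why the check against the on-disk vector data length failed even though the data was intact.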
parent 6cde41c9fd
commit 0f525bfb14
@@ -157,6 +157,14 @@ Other
 
 * LUCENE-10635: Ensure test coverage for WANDScorer by using a test query. (Zach Chen, Adrien Grand)
 
+======================== Lucene 9.4.1 =======================
+
+Bug Fixes
+---------------------
+* GITHUB#11858: Fix kNN vectors format validation on large segments. This
+  addresses a regression in 9.4.0 where validation could fail, preventing
+  further writes or searches on the index. (Julie Tibshirani)
+
 ======================== Lucene 9.4.0 =======================
 
 API Changes
@@ -175,7 +175,8 @@ public final class Lucene94HnswVectorsReader extends KnnVectorsReader {
           case BYTE -> Byte.BYTES;
           case FLOAT32 -> Float.BYTES;
         };
-    int numBytes = fieldEntry.size * dimension * byteSize;
+    long vectorBytes = Math.multiplyExact((long) dimension, byteSize);
+    long numBytes = Math.multiplyExact(vectorBytes, fieldEntry.size);
     if (numBytes != fieldEntry.vectorDataLength) {
       throw new IllegalStateException(
           "Vector data length "