FieldExistsQuery checks if there are points for a certain field, and then retrieves the
corresponding point values. When all documents that had points for a certain field have
been deleted from a certain segments, as well as merged away, field info may report
that there are points yet the corresponding point values are null.
With this change we add a null check in FieldExistsQuery. Long term, we will likely want
to prevent this situation from happening.
Relates #11393
Introduction of dynamic pruning for string sorts (#11669) introduced a bug with
string sorts and ghost fields, triggering a `NullPointerException` because the
code assumes that `LeafReader#terms` is not null if the field is indexed
according to field infos.
This commit fixes the issue and adds tests for ghost fields across all sort
types.
Hopefully we can simplify and remove the null check in the future when we
improve handling of ghost fields (#11393).
IntervalBuilder.NO_INTERVALS should return -1 when unpositioned,
not NO_MORE_DOCS. This can trigger exceptions when an empty
IntervalQuery is combined in a conjunction.
Fixes#11759
This PR removes the recently added function on LeafReader to exhaustively search
through vectors, plus the helper function KnnVectorsReader#searchExhaustively.
Instead it performs the exact search within KnnVectorQuery, using a new helper
class called VectorScorer.
If ConcurrentMergeScheduler is used, and the merge hits fatal exception (such as disk full) after prepareCommit()'s ensureOpen() check, then startCommit() will throw IllegalStateException instead of AlreadyClosedException.
The test is currently not prepared to handle this: the logic is only geared around exceptions coming from addDocument()
Closes#11755
When indexing term vectors for a very large document, the automatic computation
of the dictionary size based on the overall size of the block might yield a
size that exceeds the maximum window size that is supported by LZ4. This commit
addresses the issue by automatically taking the minimum of the result of this
computation and the maximum window size (64kB).
* Remove usages of System.currentTimeMillis() from tests
- Use Random from `RandomizedRunner` to be able to use a Seed to
reproduce tests, instead of a seed coming from wall clock.
- Replace time based tests, using wall clock to determine periods
with counter of repetitions, to have a consistent reproduction.
Closes: #11459
* address comments
* tune iterations
* tune iterations for nightly
These internal versions only make sense within a codec definition, and aren't
meant to be exposed and compared across codecs. Since this method is only used
in tests, we can move the check to the test classes instead.