ReaderPool plays a central role in the IndexWriter: it pools NRT readers
and makes sure buffered deletes and updates are written to disk. This class
used to be a non-static inner class accessing many aspects including locks
from the IndexWriter itself. This change moves the class outside of IW and
defines its responsibility clearly with respect to locks etc. Now
IndexWriter doesn't need to share ReaderPool anymore and reacts to writes done
inside the pool by checkpointing internally. This also removes acquiring the IW
lock inside the reader pool, which made reasoning about concurrency difficult.
This change also adds javadocs and dedicated tests for the ReaderPool class.
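
The decoupling described above can be pictured with a minimal sketch. It is not the actual ReaderPool or IndexWriter code: PoolSketch, WriterSketch and all method names are hypothetical, and "writing to disk" is reduced to clearing a list. The only point it illustrates is that the pool synchronizes on itself and reports whether it wrote anything, while the owner takes its own lock to checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a pool that owns buffered per-segment changes. */
final class PoolSketch {
    private final List<String> pending = new ArrayList<>();

    /** Buffers a change; synchronizes only on the pool itself. */
    synchronized void buffer(String change) {
        pending.add(change);
    }

    /** Writes all buffered changes and returns true if anything was written. */
    synchronized boolean writeAll() {
        boolean wrote = !pending.isEmpty();
        pending.clear(); // stands in for writing the changes to disk
        return wrote;
    }
}

/** Hypothetical owner that reacts to pool writes by checkpointing. */
final class WriterSketch {
    final PoolSketch pool = new PoolSketch();
    private int checkpoints = 0;

    void flushPool() {
        // The pool is called without holding the writer lock,
        // and the pool never calls back into the writer.
        if (pool.writeAll()) {
            checkpoint();
        }
    }

    private synchronized void checkpoint() {
        checkpoints++;
    }

    synchronized int checkpointCount() {
        return checkpoints;
    }
}
```

In this sketch the lock order is trivially acyclic because the pool has no reference to its owner.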
* SetAliasPropCmd now calls AliasesManager.update() first.
* SetAliasPropCmd now more efficiently updates multiple values.
* Tests: Commented out BadApple annotations on alias-related tests.
IndexWriter#numDeletesToMerge was creating a ReadersAndUpdates
for all incoming SegmentCommitInfo even if that info wasn't private
to the IndexWriter. This is an illegal use of this API but since it's
transitively public via MergePolicy#findMerges we have to be conservative
with registering ReadersAndUpdates. In IndexWriter#numDeletesToMerge we
can only use existing ones. This means for soft-deletes we need to react
earlier in order to produce accurate numbers.
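
As a rough illustration of "only use existing ones", the sketch below does a plain lookup and falls back to the committed count when no state is tracked for a segment. DeleteCountSketch, its map and its method names are assumptions made for this example and do not mirror the real IndexWriter internals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model: per-segment state is looked up, never created, when counting deletes. */
final class DeleteCountSketch {
    // Segment name -> pending (not yet committed) deletes, only for segments already tracked.
    private final Map<String, Integer> pooled = new HashMap<>();

    void track(String segment, int pendingDeletes) {
        pooled.put(segment, pendingDeletes);
    }

    /** Uses tracked state if it exists; otherwise falls back to the committed count. */
    int numDeletesToMerge(String segment, int committedDeletes) {
        Integer pending = pooled.get(segment); // plain lookup, deliberately no computeIfAbsent
        return pending == null ? committedDeletes : committedDeletes + pending;
    }

    int trackedSegments() {
        return pooled.size();
    }
}
```

Asking about a foreign segment leaves the map untouched, which is the property the paragraph above is after.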
This change partially rolls back the changes in LUCENE-8253. Instead of
registering the readers once they are pulled via IndexWriter#numDeletesToMerge
we now check on flush whether segments are fully deleted, which is very unlikely
and can be done lazily, i.e. the extra cost of opening a reader and checking all
soft-deletes is only paid if soft deletes are used and present
in the flushed segment.
This has the side-effect that flushed segments that are 100% hard deleted are also
cleaned up right after they are flushed. Previously these segments stuck
around for a while until they got picked for a merge or received another delete.
This also closes LUCENE-8256
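
The lazy check on flush can be sketched as follows. This is a simplified model, not the IndexWriter code: FlushCheckSketch and its parameters are hypothetical, and the IntSupplier stands in for the expensive step of opening a reader and counting soft-deleted documents. It is assumed to count only documents that are not already hard deleted.

```java
import java.util.function.IntSupplier;

/** Hypothetical model of a lazy "is this flushed segment fully deleted?" check. */
final class FlushCheckSketch {
    /**
     * @param maxDoc             number of documents in the flushed segment
     * @param hardDeletes        number of hard-deleted documents, known without opening a reader
     * @param softDeletesPresent true if soft deletes are configured and the segment has the field
     * @param softDeleteCounter  expensive count of soft-deleted docs that are not hard deleted
     */
    static boolean isFullyDeleted(int maxDoc, int hardDeletes,
                                  boolean softDeletesPresent, IntSupplier softDeleteCounter) {
        if (hardDeletes == maxDoc) {
            return true; // 100% hard deleted, no reader needed
        }
        if (!softDeletesPresent) {
            return false; // the expensive count is never invoked
        }
        return hardDeletes + softDeleteCounter.getAsInt() == maxDoc;
    }
}
```

The supplier is invoked only on the last branch, which matches the "only paying the extra cost if soft deletes are used and present" wording above.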
Inside the IndexWriter, buffers are only written to disk if it's needed
or "worth it", which doesn't guarantee that soft deletes are accounted
for in time. This is not necessarily a problem since they are eventually
collected and segments that have soft-deletes will be merged eventually,
but for tests and for parity with hard deletes this behavior
is tricky.
This change cuts over to in-place accounting, just like hard-deletes. This
results in accurate delete numbers for soft deletes at any given point in time
once the reader is loaded or a pending soft delete occurs.
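
A minimal sketch of what in-place accounting might look like: the counter is bumped at the moment a soft delete is applied, so the number is accurate without waiting for a buffer to be written. SoftDeleteCounterSketch is a hypothetical class for illustration and not how Lucene stores this state.

```java
import java.util.BitSet;

/** Hypothetical per-segment counter that is updated as soon as a soft delete is applied. */
final class SoftDeleteCounterSketch {
    private final BitSet softDeleted;
    private int numSoftDeletes;

    SoftDeleteCounterSketch(int maxDoc) {
        this.softDeleted = new BitSet(maxDoc);
    }

    /** Marks a document as soft deleted; returns false if it was already marked. */
    boolean softDelete(int docId) {
        if (softDeleted.get(docId)) {
            return false; // already counted, keep the number stable
        }
        softDeleted.set(docId);
        numSoftDeletes++;
        return true;
    }

    int numSoftDeletes() {
        return numSoftDeletes;
    }
}
```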
This change also fixes an issue where all updates to a DV field were allowed
even if the field was unknown. Now updates to an unknown field are only allowed
if the field is the soft deletes field. This behavior was never released.
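
The field check described above could look roughly like the sketch below. The class name, method name and exception message are assumptions for illustration; only the rule itself (an unknown field is accepted only if it is the soft deletes field) comes from the text.

```java
import java.util.Set;

/** Hypothetical validation for doc-values updates. */
final class UpdateFieldCheckSketch {
    /**
     * Accepts an update if the field is already known as a doc-values field,
     * or if it is the configured soft deletes field (which may be null).
     */
    static void checkUpdatable(String field, Set<String> knownDocValuesFields,
                               String softDeletesField) {
        boolean known = knownDocValuesFields.contains(field);
        boolean isSoftDeletesField = field.equals(softDeletesField);
        if (!known && !isSoftDeletesField) {
            throw new IllegalArgumentException(
                "can only update existing doc-values fields or the soft deletes field: " + field);
        }
    }
}
```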