TermInSetQuery currently iterates through all its prefix-encoded terms
in order to build an array to pass back to its visitor when visit() is called.
This seems like a waste, particularly when the visitor is not actually
consuming the terms (for example, when doing a clause-count check
before executing a search). This commit changes TermInSetQuery to use
consumeTermsMatching(), and also changes the signature of this method so
that it takes a BytesRunAutomaton supplier to allow for lazy instantiation. In
addition, IndexSearcher's clause count check wasn't counting leaves that
called consumeTermsMatching().
Today it looks like wild wild west inside IndexWriter and some of it's
associated classes. This change makes sure all non-final members have
private visibility, methods that are not used outside of IW today are
made private unless they have been public. This change also removes
some unused or unnecessary members where possible and deleted some dead
code from previous refactoring.
Today we still have one class that runs some tricky logic that should
be in the IndexWriter in the first place since it requires locking on
the IndexWriter itself. This change inverts the API and now FrozendBufferedUpdates
does not get the IndexWriter passed in, instead the IndexWriter owns most of the logic
and executes on a FrozenBufferedUpdates object. This prevent locking on IndexWriter out
side of the writer itself and paves the way to simplify some concurrency down the road
This change extracts the methods that are used by MergeScheduler into
a MergeSource interface. This allows IndexWriter to better ensure
locking, hide internal methods and removes the tight coupling between the two
complex classes. This will also improve future testing.
This method is trappy; it doesn't work for all SortField types, but doesn't tell
you that until runtime. This commit deprecates it, and removes all other
callsites in the codebase.
Replaces SimpleBindings' Map<String, Object> with a map of
Function<Bindings, DoubleValuesSource> to improve type safety, and
reworks cycle detection and validation to avoid catching
StackOverflowException
This test produced tons of files on nighly builds causing
TooManyOpenFilesExceptions likely due to not using CFS on flush
and/or very small maxMergeSize values.
IW#maybeMerge calls the MergeScheduler even if it didn't find any merges we should instead only do this if there is in-fact anything there to merge and safe the call into a sync'd method.
CMS today releases it's lock after finishing a merge before it re-acquires it to update
the thread accounting datastructures. This causes threading issues where concurrently
finishing threads fail to pick up pending merges causing potential thread starvation
on forceMerge calls.