This change removes the ThreadState indirection from DWPTPool and pools DWPTs directly. The tracking information and locking semantics are mostly moved into DWPT itself, and the pool semantics have changed slightly: DWPTs now need to be checked out of the pool once they need to be flushed or aborted. The number of DWPTs in the system now grows and shrinks automatically as the number of threads grows or shrinks. Access to pooled DWPTs is more straightforward and doesn't require an ordinal; instead, consumers can simply iterate over the elements in the pool.
This allowed for the removal of indirections in DWPTFlushControl such as BlockedFlush, the removal of the DWPTPool setter and getter in IndexWriterConfig, and the addition of stronger assertions in DWPT and DW.
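A minimal sketch of the check-out idea, with made-up names that do not mirror the actual DWPTPool API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a pool with check-out semantics; names and structure are
// illustrative only and do not reflect Lucene's DocumentsWriterPerThreadPool.
final class CheckoutPool<T> {
  private final Deque<T> free = new ArrayDeque<>();   // idle writers, ready for indexing
  private final List<T> all = new ArrayList<>();      // every pooled writer, free or in use

  synchronized void add(T writer) {
    all.add(writer);
    free.push(writer);
  }

  // A writer must be checked out before it can be flushed or aborted, which
  // guarantees no other thread is concurrently indexing into it.
  synchronized T checkout() {
    return free.poll(); // null if nothing is idle right now
  }

  synchronized void release(T writer) {
    free.push(writer);  // return the writer once flushing/indexing is done
  }

  // Consumers iterate over the pool directly instead of addressing writers by ordinal.
  synchronized List<T> snapshot() {
    return new ArrayList<>(all);
  }
}
```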
This test failed on Elastic CI because we did not add any terms in the
loop. This commit ensures that we always add at least one docId, term,
and query in the test.
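One common way to guarantee at least one element in randomized loops (illustrative only, not the exact test code):

```java
import java.util.Random;

// Illustrative only: draw counts as 1 + nextInt(n) so each loop runs at least once.
public class AtLeastOneExample {
  public static void main(String[] args) {
    Random random = new Random();
    int numDocIds = 1 + random.nextInt(100);  // never zero
    int numTerms = 1 + random.nextInt(50);    // never zero
    int numQueries = 1 + random.nextInt(20);  // never zero
    System.out.println(numDocIds + " docIds, " + numTerms + " terms, " + numQueries + " queries");
  }
}
```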
The SegmentMerger usage in IW#addIndexes(CodecReader...) might make changes
to the Directory while the IW tries to clean up files on rollback. This
causes issues like FileNotFoundExceptions when IFD tries to remove temp files.
This change adds a waiting mechanism to the abortMerges method that, in addition
to the running merges, also waits for merges in addIndexes(CodecReader...).
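A rough sketch of such a waiting mechanism, using hypothetical names rather than IndexWriter's actual fields and methods:

```java
// Hypothetical sketch: count in-flight addIndexes(CodecReader...) merges and let
// abortMerges() wait until they have finished. Names are illustrative, not IndexWriter's.
final class AddIndexesMergeTracker {
  private int inFlight = 0;
  private boolean aborted = false;

  // called when addIndexes(CodecReader...) starts merging into a temp segment
  synchronized boolean tryStart() {
    if (aborted) {
      return false;       // rollback already in progress; don't touch the Directory
    }
    inFlight++;
    return true;
  }

  // called when the addIndexes merge finishes, successfully or not
  synchronized void finish() {
    inFlight--;
    notifyAll();
  }

  // called from abortMerges(): block until no addIndexes merge can still write files
  synchronized void abortAndWait() throws InterruptedException {
    aborted = true;
    while (inFlight > 0) {
      wait();
    }
  }
}
```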
* getProcessedFilter now more reliably returns a null filter when the filter matches all docs
* getProcessedFilter is now clearly documented as an internal method
* getDocSet detects the all-docs case and exits early via getLiveDocs (see the sketch below)
* small refactoring of getDocSetBits/makeDocSetBits
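A minimal sketch of the all-docs early exit, with made-up names that are not SolrIndexSearcher's actual code:

```java
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.FixedBitSet;

// Hypothetical helper showing only the early-exit idea; the class and method
// below are made up for this sketch.
final class AllDocsShortCircuit {
  static FixedBitSet docSetFor(Query query, FixedBitSet cachedLiveDocs) {
    if (query instanceof MatchAllDocsQuery) {
      // all-docs case: reuse the cached live-docs bitset instead of collecting again
      return cachedLiveDocs;
    }
    // otherwise fall through to the normal collection path (omitted in this sketch)
    throw new UnsupportedOperationException("collection path omitted");
  }
}
```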
Closes #1399
* Some set/clear were not balanced.
* Harden clear() in case of imbalance.
* Sometimes coreContainer.getCore was called unnecessarily; only a core descriptor is needed
* SolrCore.open/close now call MDCLoggerContext.setCore/clear (see the sketch after this list)
* no need to clear MDC in HttpSolrCall
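A small illustrative sketch of keeping MDC set/clear balanced and hardening clear(); the key name and class are assumptions, not Solr's MDCLoggerContext:

```java
import org.slf4j.MDC;

// Illustrative sketch only; Solr's MDCLoggerContext may differ. The point is that
// set and clear stay balanced, and clear() is safe even when nothing was set.
final class CoreMdcExample {
  private static final String CORE_KEY = "core"; // assumed MDC key name

  static void setCore(String coreName) {
    MDC.put(CORE_KEY, coreName);
  }

  static void clear() {
    MDC.remove(CORE_KEY); // removing an absent key is a no-op, so imbalance can't leak state
  }

  static void withCore(String coreName, Runnable work) {
    setCore(coreName);
    try {
      work.run();
    } finally {
      clear(); // guaranteed to run, keeping set/clear balanced
    }
  }
}
```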
Today a doc values update creates a new field infos file that contains the original field infos updated for the new generation as well as the new fields created by the doc values update.
However, existing fields are cloned from the global field infos (shared in the index writer) instead of the local ones (present in the segment).
In practice this is not an issue since field numbers are shared between segments created by the same index writer.
But this assumption doesn't hold for segments created by different writers and added through IndexWriter#addIndexes(Directory).
In this case, the field number of the same field can differ between segments so any doc values update can corrupt the index
by assigning the wrong field number to an existing field in the next generation.
When this happens, queries and merges can access wrong fields without throwing any error, leading to a silent corruption in the index.
This change ensures that we preserve local field numbers when creating
a new field infos generation.
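A made-up illustration of why preserving the segment-local numbers matters; the classes below are not Lucene's:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the fix: when building the next field-infos generation,
// reuse the segment-local field number rather than the writer's global one.
// Names here are invented for the sketch; they are not Lucene's actual classes.
final class FieldNumberExample {
  static int numberForExistingField(String field,
                                    Map<String, Integer> segmentLocalNumbers,
                                    Map<String, Integer> writerGlobalNumbers) {
    Integer local = segmentLocalNumbers.get(field);
    if (local != null) {
      return local; // preserve the number already recorded in this segment
    }
    // only brand-new fields (added by the doc values update) fall back to the global mapping
    return writerGlobalNumbers.get(field);
  }

  public static void main(String[] args) {
    Map<String, Integer> local = new HashMap<>(Map.of("title", 3));
    Map<String, Integer> global = new HashMap<>(Map.of("title", 7, "views", 8));
    // a segment imported via addIndexes(Directory) can number "title" differently (3 vs 7)
    System.out.println(numberForExistingField("title", local, global)); // 3, not 7
    System.out.println(numberForExistingField("views", local, global)); // 8 (new DV field)
  }
}
```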
ASF Release Policy states that we cannot have binary JAR files checked
into our source releases; a few other projects have solved this by
modifying their generated gradlew scripts to download a copy of the
wrapper jar.
We now have a version and checksum file in the ./gradle/wrapper directory
used for verifying the wrapper jar, and we take advantage of single-source
Java execution to verify and download it.
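A rough sketch, under assumed file names and a placeholder URL, of what such a single-source Java checker might look like (not the actual script in the repo):

```java
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

// Illustrative single-source Java program; the file names, URL, and checksum format
// below are assumptions for the sketch and do not reflect the actual build scripts.
public class WrapperDownloader {
  public static void main(String[] args) throws Exception {
    Path jar = Path.of("gradle/wrapper/gradle-wrapper.jar");
    String version = Files.readString(Path.of("gradle/wrapper/gradle-wrapper.version")).trim();
    String expected = Files.readString(Path.of("gradle/wrapper/gradle-wrapper.jar.sha256")).trim();

    if (Files.notExists(jar)) {
      // download the wrapper jar matching the pinned version (placeholder URL for the sketch)
      URI uri = URI.create("https://example.invalid/gradle-wrapper-" + version + ".jar");
      try (InputStream in = uri.toURL().openStream()) {
        Files.copy(in, jar);
      }
    }

    // verify the jar against the recorded SHA-256 checksum
    MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
    byte[] digest = sha256.digest(Files.readAllBytes(jar));
    StringBuilder actual = new StringBuilder();
    for (byte b : digest) {
      actual.append(String.format("%02x", b));
    }
    if (!expected.equalsIgnoreCase(actual.toString())) {
      System.err.println("Checksum mismatch for " + jar);
      System.exit(1);
    }
  }
}
```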
The gradle wrapper jar will continue to be available in the git
repository, but will be excluded from src tarball generation. This
should not change workflows for any users, since we expect the gradlew
script to get the jar when it is missing.
Co-authored-by: Dawid Weiss <dweiss@apache.org>
This commit introduces a mechanism to control the allocation of threads to the slices planned for a query.
The default implementation uses the size of the executor's backlog queue to determine whether a slice should be allocated a new thread.
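A minimal sketch of queue-size-based slice allocation, assuming a ThreadPoolExecutor and a made-up backlog threshold (not Lucene's actual implementation):

```java
import java.util.List;
import java.util.concurrent.ThreadPoolExecutor;

// Illustrative sketch of queue-size-based slice execution; the class name and
// threshold are invented and do not mirror Lucene's actual code.
final class QueueSizeBasedSliceRunner {
  private final ThreadPoolExecutor executor;
  private final int maxBacklog; // above this backlog, run the slice on the caller thread

  QueueSizeBasedSliceRunner(ThreadPoolExecutor executor, int maxBacklog) {
    this.executor = executor;
    this.maxBacklog = maxBacklog;
  }

  void run(List<Runnable> sliceTasks) {
    for (Runnable task : sliceTasks) {
      if (executor.getQueue().size() >= maxBacklog) {
        task.run();             // backlog too deep: don't hand this slice a new thread
      } else {
        executor.execute(task); // backlog is small: let the pool pick it up
      }
    }
  }
}
```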