* When the schema defines _root_, and you want to do atomic/partial updates...
** _root_ needn't be stored or have docValues any more
** _nest_path_ field isn't needed for this any more
** Simplified internal logic
* Allow (and recommend, eventually insist) that the _root_ field be passed for atomic/partial updates to child docs.
** In the absence of _root_, assume the _route_ param is equivalent, to limit the scope of the back-compat break. This is a temporary hack; remove in SOLR-15064.
** One of the two is required; you'll get an exception if the assumption is false. THIS IS A BACK-COMPAT CHANGE
* Ensure that the update log contains the _root_ field if it's defined in the schema; in some cases it wasn't. It's important for robustness of atomic/partial updates to child docs. Caveat: the buffer replay scenario is not tested with child docs.
* Limited the cases in which a realtime searcher is re-opened. Previously this happened for any update that included child docs; now it happens only for atomic/partial updates when the update log contains an in-place update for the same nest, because those log entries are complicated to resolve.
* Internal improvements to RealTimeGetComponent to aid clarity & robustness & probably performance...
** Use SolrDocumentFetcher.solrDoc(docID, ReturnFields) instead of more manual loading. Will do more with this in another PR.
** Clarify when only root doc IDs are expected.
** Use Resolution enum more, add PARTIAL, remove DOC_WITH_CHILDREN; enhance docs.
** When we have ReturnFields, a Set of "onlyTheseFields" becomes redundant. Add child doc resolution via a transformer when needed.
** Clarified where copy-field targets are removed
* NestPathField should default to single valued, instead of inheriting the schema default, which for ancient schemas was multi-valued.
* AddUpdateCommand.getLuceneDocument(s) methods are very internal; made package visible and refactored a bit for clarity
* DocumentBuilder: for an in-place update, skip id and _root_ here, which also simplifies further logic
* NestedShardedAtomicUpdateTest no longer extends AbstractFullDistribZkTestBase because it wasn't really leveraging the "control client" checking, and it added too much complexity to debug failures.
missing, allBuckets, and numBuckets are not supported with the stream method.
So avoid picking the stream method when any one of them is enabled, even if
the facet sort is 'index asc'.
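
The selection rule described above could be sketched roughly as follows. This is a minimal illustration only: FacetMethodChooser, its Method enum, and the parameter names are hypothetical and do not correspond to Solr's actual facet classes.

```java
// Hypothetical sketch; these names are illustrative, not Solr's real API.
final class FacetMethodChooser {
    enum Method { STREAM, DEFAULT }

    // Pick STREAM only when it was requested, the sort is 'index asc',
    // and none of the options that streaming cannot serve are enabled.
    static Method choose(boolean streamRequested, String sort,
                         boolean missing, boolean allBuckets, boolean numBuckets) {
        boolean indexAsc = "index asc".equals(sort);
        boolean needsUnsupported = missing || allBuckets || numBuckets;
        if (streamRequested && indexAsc && !needsUnsupported) {
            return Method.STREAM;
        }
        return Method.DEFAULT;
    }
}
```

With this shape, a request for the stream method with 'index asc' and numBuckets enabled falls back to the non-streaming path.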
CopyFields are regenerated in case of replace-field or replace-field-type.
While regenerating, source and destination are checked against the schema's fields, but source/dest
could match a dynamic rule too.
For example,
<copyField source="something_s" dest="spellcheck"/>
<dynamicField name="*_s" type="string"/>
Here, something_s is not present in the schema but matches the dynamic rule.
To handle this case, also check dynamicFieldCache while regenerating the
copyFields.
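
A rough sketch of the two-step lookup, assuming dynamic field patterns with a single leading or trailing asterisk. CopyFieldSourceCheck and its methods are hypothetical names chosen for illustration; the real logic lives in Solr's schema classes and is not reproduced here.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the actual Solr schema implementation.
final class CopyFieldSourceCheck {

    // Match a name against a pattern with one leading or trailing '*'.
    static boolean matchesDynamic(String pattern, String name) {
        if (pattern.startsWith("*")) {
            return name.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            return name.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(name);
    }

    // A copyField source/dest is resolvable if it is an explicit field
    // or, failing that, if it matches any dynamic field rule.
    static boolean isKnown(String name, Set<String> explicitFields,
                           List<String> dynamicPatterns) {
        if (explicitFields.contains(name)) {
            return true;
        }
        for (String pattern : dynamicPatterns) {
            if (matchesDynamic(pattern, name)) {
                return true;
            }
        }
        return false;
    }
}
```

For the example above, something_s is not an explicit field, but it resolves through the *_s rule, so the copyField is kept during regeneration.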
Return proper error code on invalid value with in-place update.
Handle an invalid value for the inc op with an in-place update: use toNativeType to convert the increment value instead of parsing it directly. Also return an error when an inc operation is specified for a non-numeric field.
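
The idea can be sketched as below, as an illustration rather than the actual change. IncOpSketch, FieldKind, and toNativeIncrement are hypothetical names, and IllegalArgumentException stands in for whatever client error Solr reports.

```java
// Hypothetical sketch; Solr's real conversion goes through the field type's toNativeType.
final class IncOpSketch {
    enum FieldKind { INT, DOUBLE, STRING }

    // Convert the raw increment according to the field's kind, and turn
    // both bad values and non-numeric fields into a client-facing error.
    static Number toNativeIncrement(FieldKind kind, Object raw) {
        String text = String.valueOf(raw).trim();
        try {
            switch (kind) {
                case INT:
                    return Integer.valueOf(text);
                case DOUBLE:
                    return Double.valueOf(text);
                default:
                    throw new IllegalArgumentException(
                        "'inc' is not supported on a non-numeric field");
            }
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid increment value: " + raw, e);
        }
    }
}
```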
Context queries check if the wrapped automaton is empty, and if so return an
empty automaton. This commit improves the check for empty automata, which
allows for handling an empty PrefixCompletionQuery as well.
* Additional options to KnnGraphTester to support benchmarking with ann-benchmarks
* switch to parallel array-based storage in HnswGraph (was using LongHeap)
* LUCENE-9617: Reset lowestUnassignedFieldNumber in FieldNumbers.clear()
FieldNumbers.clear() is called from IndexWriter.deleteAll(), which is
supposed to completely reset the state of the index. This includes
clearing all known fields.
Prior to this change, it would allocate progressively higher field
numbers, which results in larger and larger arrays for
FieldInfos.byNumber, effectively "leaking" field numbers every time
deleteAll() is called.
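
A simplified model of the leak and the fix is shown below. FieldNumbersSketch is a hypothetical stand-in, not Lucene's FieldNumbers class; only the lowestUnassignedFieldNumber name is taken from the commit title.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a field-number registry.
final class FieldNumbersSketch {
    private final Map<String, Integer> nameToNumber = new HashMap<>();
    private int lowestUnassignedFieldNumber = -1;

    // Return the existing number for a field, or assign the next free one.
    synchronized int addOrGet(String fieldName) {
        Integer number = nameToNumber.get(fieldName);
        if (number == null) {
            number = ++lowestUnassignedFieldNumber;
            nameToNumber.put(fieldName, number);
        }
        return number;
    }

    // Forget all fields AND reset the counter; without the reset, numbers
    // keep growing across clear() calls even though no fields remain.
    synchronized void clear() {
        nameToNumber.clear();
        lowestUnassignedFieldNumber = -1;
    }
}
```

After clear(), the next field added is numbered from zero again, so arrays indexed by field number stay as small as the live field set.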
Co-authored-by: Michael Froh <froh@amazon.com>