The recent addition of support for a "readonly" mode for collections
opens the door to restoring to already-existing collections.
This commit adds a codepath to allow this. Any compatible existing
collection may be used for restoration, including the collection that
was the original source of the backup.
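For illustration, here is a minimal SolrJ sketch of restoring into an already-existing collection. The Solr URL, collection name, backup name, and location are placeholders, not values from this commit:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class RestoreToExistingCollection {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      // The target collection may already exist, including the collection
      // that was the original source of the backup.
      CollectionAdminRequest.Restore restore =
          CollectionAdminRequest.restoreCollection("techproducts", "nightly");
      restore.setLocation("/backups/solr"); // hypothetical backup location
      restore.process(client);
    }
  }
}
```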
* relocate xslt related classes into scripting contrib
* relocating files to scripting and separating out unit tests
* relocate files under test-files/scripting/solr, similar to how we do it in other contribs. This also deals with some issues in finding files
* Reformatting using the Google Java Format...
* use the actual param name, not the variable, to properly test the API!
* Clean up references to paths, and deal with the mishmash of Xslt and XSLT in class names.
* Move XSLT processing out of XMLLoader
* Move TransformerProvider.Dedupe getTransformer logic.
Co-authored-by: epugh@opensourceconnections.com <>
Co-authored-by: David Smiley <dsmiley@apache.org>
SOLR-13608 introduces a new "incremental" backup format, which allows
storage of multiple backup "points" in the same location. This
development introduces a need for APIs to manage these potentially
plural backups.
This commit introduces /admin/collections?action=LISTBACKUPS and
/admin/collections?action=DELETEBACKUP to handle these backups.
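A hedged SolrJ sketch of calling both actions through the generic collections-admin path. Only the action names come from this commit; the `name`, `location`, and `backupId` parameter names are assumptions for illustration:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class ManageBackups {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      // List the backup points stored under a given backup name/location.
      ModifiableSolrParams list = new ModifiableSolrParams();
      list.set("action", "LISTBACKUPS");
      list.set("name", "nightly");           // hypothetical backup name
      list.set("location", "/backups/solr"); // hypothetical location
      NamedList<Object> backups = client.request(
          new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/collections", list));
      System.out.println(backups);

      // Delete one backup point; the id parameter name is an assumption.
      ModifiableSolrParams delete = new ModifiableSolrParams();
      delete.set("action", "DELETEBACKUP");
      delete.set("name", "nightly");
      delete.set("location", "/backups/solr");
      delete.set("backupId", "3");
      client.request(new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/collections", delete));
    }
  }
}
```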
This commit introduces a new way for Solr to do backups (with a new
underlying file structure). This new "incremental" backup process
improves over the existing backup mechanism in several ways:
- multiple backup "points" can now be stored at a given backup
location/name, allowing users to choose which point in time they want
to restore
- subsequent backups skip uploading files that were uploaded by
previous backups, saving time and network bandwidth.
- files are checksummed as they're uploaded, ensuring that corrupted
indices aren't persisted and accidentally restored later.
Incremental backups are now the default, and traditional backups
should now be considered 'deprecated' but can still be created by
passing an `incremental=false` parameter on backup requests.
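A sketch of opting out of the new default by passing `incremental=false` on a backup request. The `incremental` parameter comes from this commit; the collection/backup names and location are placeholders:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TraditionalBackup {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      ModifiableSolrParams p = new ModifiableSolrParams();
      p.set("action", "BACKUP");
      p.set("collection", "techproducts"); // placeholder collection
      p.set("name", "nightly");            // placeholder backup name
      p.set("location", "/backups/solr");  // placeholder location
      p.set("incremental", "false");       // omit for the (default) incremental format
      client.request(new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/collections", p));
    }
  }
}
```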
* Creating Scripting contrib module to centralize the less secure code related to scripts.
* tweak the changelog and update notice to explain why the name changed and the thinking behind the security posture
* the test script happens to be a currency.xml file, which made me think we were doing something specific to currency types; in fact, any XML-formatted file will suffice for the test.
* Update solr/contrib/scripting/src/java/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.java
* Update solr/contrib/scripting/src/java/org/apache/solr/scripting/update/package-info.java
* drop the "-ing", and be more specific in the name of the ref guide page
* comment out the script update chain.
The sample techproducts configSet is used by many of the Solr unit tests, and by default it doesn't have access to the jar file in the contrib module. This is commented out, similar to how the lang contrib is handled.
* using a Mock for the script processor in order to keep the trusted configSets tests all together.
* tweak since we are using a mock script processor
Co-authored-by: David Smiley <dsmiley@apache.org>
* use the same name everywhere
Co-authored-by: David Smiley <dsmiley@apache.org>
* When the schema defines _root_, and you want to do atomic/partial updates...
** _root_ needn't be stored or have docValues any more
** _nest_path_ field isn't needed for this any more
** Simplified internal logic
* Allow (and recommend, eventually insist) that the _root_ field be passed for atomic/partial updates to child docs; see the sketch after this list.
** In the absence of _root_, assume the _route_ param is equivalent, to soften the scope of the back-compat change. This is a temporary hack; remove in SOLR-15064.
** One of the two is required; you'll get an exception if the assumption is false. THIS IS A BACK-COMPAT CHANGE
* Ensure that the update log contains the _root_ field if it's defined in the schema; in some cases it wasn't. It's important for robustness of atomic/partial updates to child docs. Caveat: the buffer replay scenario is not tested with child docs.
* Limited the cases in which a realtime searcher is re-opened. This used to happen for any update that included child docs; now it happens only in a narrow subset of cases: atomic/partial updates, and only when the update log contains an in-place update for the same nest, because resolving those log entries is complicated.
* Internal improvements to RealTimeGetComponent to aid clarity & robustness & probably performance...
** Use SolrDocumentFetcher.solrDoc(docID, ReturnFields) instead of more manual loading. Will do more with this in another PR.
** Clarify when only root doc IDs are expected.
** Use Resolution enum more, add PARTIAL, remove DOC_WITH_CHILDREN; enhance docs.
** Once we have ReturnFields, a separate Set of "onlyTheseFields" becomes redundant. Add child doc resolution via a transformer when needed.
** Clarified where copy-field targets are removed
* NestPathField should default to single valued, instead of inheriting the schema default, which for ancient schemas was multi-valued.
* AddUpdateCommand.getLuceneDocument(s) methods are very internal; made them package-visible and refactored a bit for clarity
* DocumentBuilder: when in-place update, skip id and _root_ here, thus also simplifying further logic
* NestedShardedAtomicUpdateTest no longer extends AbstractFullDistribZkTestBase because it wasn't really leveraging the "control client" checking, and it added too much complexity to debug failures.
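To make the new requirement concrete, here is a hypothetical SolrJ atomic update of a child doc that passes _root_ explicitly; the ids and the price field are made up:

```java
import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ChildAtomicUpdate {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.setField("id", "child-1");      // the child document being updated
      doc.setField("_root_", "parent-1"); // now expected (or pass the _route_ param instead)
      doc.setField("price", Collections.singletonMap("set", 42)); // atomic "set" operation
      client.add(doc);
      client.commit();
    }
  }
}
```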
missing, allBuckets, and numBuckets are not supported with the stream method.
So, avoid picking the stream method when any one of them is enabled, even if
the facet sort is 'index asc'.
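For example, a request like this sketch is sorted 'index asc' yet must now avoid the stream method because numBuckets is enabled (the collection and field names are illustrative):

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class FacetMethodExample {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("q", "*:*");
      // "index asc" sort alone would allow the stream method, but numBuckets
      // (like allBuckets or missing) forces the non-streaming code path.
      params.set("json.facet",
          "{ cats: { type: terms, field: cat, sort: \"index asc\", numBuckets: true } }");
      System.out.println(new QueryRequest(params).process(client));
    }
  }
}
```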