SOLR-13662: Improvements for Ref Guide package-manager.adoc

Cassandra Targett 2019-12-18 09:28:16 -06:00
parent 56839f6ace
commit fc2fbb2f7e
1 changed files with 49 additions and 40 deletions

@ -1,5 +1,6 @@
= Package Management = Package Management
:page-children: package-manager-internals :page-children: package-manager-internals
:page-tocclass: right
// Licensed to the Apache Software Foundation (ASF) under one // Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file // or more contributor license agreements. See the NOTICE file
@ -18,52 +19,53 @@
// specific language governing permissions and limitations // specific language governing permissions and limitations
// under the License. // under the License.
== Glossary of Terms The package manager in Solr allows installation and update of Solr-specific packages in distributed and standalone environments.
=== Package In this system, a _package_ is a set of Java jar files (usually one) containing one or more <<solr-plugins.adoc#solr-plugins,Solr plugins>>. Each jar file is also accompanied by a signature string (which can be verified against a supplied public key).
A set of jar files (usually one) containing one or more <<solr-plugins.adoc#solr-plugins,Solr plugins>>. Each jar file is also accompanied by a signature string (which can be verified against a supplied public key).
=== Repository A key design aspect of this system is the ability to install or update packages in a cluster environment securely without the need to restart every node.
A location hosting one or many packages. Usually, this is a web service that serves meta information about packages as well as serves the package artifacts for downloading.
== Overview Other elements of the design include the ability to install from a remote repository; package standardization; a command line interface (CLI); and a package store.
The package manager in Solr consists of the following internal components:
* Package Manager CLI This section will focus on how to use the package manager to install and update plugins.
* Package Manager internal APIs For technical details, see the section <<package-manager-internals.adoc#package-manager-internals,Package Manager internals>>.
* Isolated classloaders
* Package Store
In this guide, we will focus on the Package Manager CLI, which essentially uses the other APIs and components internally. For details on the other components (and hence details of inner workings of the package manager), please refer to <<package-manager-internals.adoc#package-manager-internals,Package Manager internals>>.
== Interacting with the Package Manager == Interacting with the Package Manager
Essentially, these are the phases in using the package manager: The package manager CLI allows you to:
* Starting Solr with support for package management * Start Solr with support for package management
* Adding trusted repositories * Add trusted repositories
* Listing and installing packages * List packages at a repository
* Deploying packages on to collections * Install desired packages
* Updating packages * Deploy packages to collections
* Update packages when updates are available
=== Starting Solr with Package Management Support === Enable the Package Manager
Start all Solr nodes with the `-Denable.packages=true` parameter. There are security consequences in doing so. At a minimum, no unauthorized user should have write access to ZooKeeper instances, since it would then be possible to install packages from untrusted sources (e.g. malicious repositories). The package manager is disabled by default. To enable it, start all Solr nodes with the `-Denable.packages=true` parameter.
[source,bash] [source,bash]
---- ----
$ bin/solr -c -Denable.packages=true $ bin/solr -c -Denable.packages=true
---- ----
=== Adding Trusted Repositories WARNING: There are security consequences to enabling the package manager.
If an unauthorized user gained access to the system, they would have write access to ZooKeeper and could install packages from untrusted sources. Always ensure you have secured Solr with firewalls and <<authentication-and-authorization-plugins.adoc#authentication-and-authorization-plugins,authentication>> before enabling the package manager.
In order to install packages into Solr, one has to add a repository hosting the packages. A repository is essentially a web service hosting package artifacts (jar files) and a public key (to validate the signatures of the jar files while installing). Note: Please do not add repositories that you don't trust or control. Also, only add repositories that are based on https and avoid repositories based on http to safeguard against MITM attacks. === Add Trusted Repositories
A _repository_ is a location hosting one or many packages. Often this is a web service that serves meta-information about packages, the package artifacts for downloading, and a public key to validate the jar file signatures while installing.
In order to install packages into Solr, one has to add a repository hosting the packages.
[source,bash] [source,bash]
---- ----
$ bin/solr package add-repo <name-of-repo> <repo-url> $ bin/solr package add-repo <name-of-repo> <repo-url>
---- ----
NOTE: Do not add repositories that you don't trust or control. Only add repositories that are based on HTTPS and avoid repositories based on HTTP to safeguard against MITM attacks.
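The HTTPS-only advice above can be enforced mechanically before `add-repo` is called. The following is a minimal sketch, not part of Solr: the function names `is_https_url` and `add_trusted_repo` and the `SOLR_BIN` variable are hypothetical, and only the `bin/solr package add-repo <name-of-repo> <repo-url>` invocation comes from the text above.

```shell
# Hypothetical helper, not part of Solr: rejects non-HTTPS repository URLs
# before handing off to the documented `bin/solr package add-repo` command.

# Returns 0 if the URL starts with https:// and has something after it.
is_https_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: add_trusted_repo <name-of-repo> <repo-url>
add_trusted_repo() {
  if ! is_https_url "$2"; then
    echo "Refusing to add '$1': $2 is not an HTTPS URL" >&2
    return 1
  fi
  # SOLR_BIN is an assumed variable pointing at the bin/solr script.
  "${SOLR_BIN:-bin/solr}" package add-repo "$1" "$2"
}
```

A scheme check like this only guards against plain-HTTP URLs; it says nothing about whether the repository itself is trustworthy.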
=== Listing and Installing Packages === Listing and Installing Packages
To list installed packages: To list installed packages:
@ -73,7 +75,6 @@ To list installed packages:
$ bin/solr package list-installed $ bin/solr package list-installed
---- ----
To list packages available for installation from added repositories: To list packages available for installation from added repositories:
[source,bash] [source,bash]
@ -88,31 +89,38 @@ To install a package:
$ bin/solr package install <package-name>[:<version>] $ bin/solr package install <package-name>[:<version>]
---- ----
=== Deploying a Package to a Collection === Deploy a Package
Once a package has been installed, the plugins contained in it can be used in a collection, using either of the two methods: Once a package has been installed, the plugins contained in it can be used in a collection.
==== Deploying using deploy Command There are two ways to do this: use the CLI's `deploy` command, or deploy manually.
This can be done using the package manager's `deploy` command, provided the package supports it (package author's documentation would usually mention that):
==== deploy Command
If the package author states support for it, the package can be deployed with the CLI's `deploy` command:
[source,bash] [source,bash]
---- ----
$ bin/solr package deploy <package-name>[:<version>] -collections <collection1>[,<collection2>,...] $ bin/solr package deploy <package-name>[:<version>] -collections <collection1>[,<collection2>,...]
---- ----
This may prompt you to execute a command to deploy the package. If you pass `-y` to the command, then this prompt can be skipped. The author may want you to confirm deployment of a package via a prompt.
If you pass `-y` to the command, confirmation can be skipped.
==== Deploying Manually ==== Manual Deploy
Alternatively, it is also possible to manually edit a configset (solrconfig.xml, managed-schema / schema.xml, etc.) and use it by RELOADing a collection.
Example: Add a request handler from the package `mypackage` to a configset's solrconfig.xml: It is also possible to deploy a package manually by editing a configset (e.g., `solrconfig.xml`, `managed-schema`/`schema.xml`, etc.) and reloading the collection.
For example, if a package named `mypackage` contains a request handler, we would add it to a configset's `solrconfig.xml` like this:
[source, xml] [source, xml]
---- ----
<requestHandler name="/myhandler" class="mypackage:full.path.to.MyClass"></requestHandler> <requestHandler name="/myhandler" class="mypackage:full.path.to.MyClass"></requestHandler>
---- ----
After that, `RELOAD` your collection. Now, you should set the package version that this collection is using, as follows (say collection is called `collection1` and package name is `mypackage` and installed version is `1.0.0`): Then use either the Collections API <<collection-management.adoc#reload,RELOAD command>> or the <<collections-core-admin.adoc#collections-core-admin,Admin UI>> to reload the collection.
Next set the package version that this collection is using. If the collection is named `collection1`, the package name is `mypackage`, and the installed version is `1.0.0`, the command would look like this:
[source,bash] [source,bash]
---- ----
@ -120,26 +128,26 @@ curl "http://localhost:8983/api/collections/collection1/config/params" \
-H 'Content-type:application/json' -d "{set: {PKG_VERSIONS: {mypackage: '1.0.0'}}}" -H 'Content-type:application/json' -d "{set: {PKG_VERSIONS: {mypackage: '1.0.0'}}}"
---- ----
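When several collections or packages need their version pinned, the request above can be wrapped in a small function. This is a sketch under stated assumptions: `pkg_version_payload`, `set_pkg_version`, and the `SOLR_URL` variable are hypothetical names, while the endpoint path, header, and request body mirror the `curl` example above.

```shell
# Prints the request body used in the curl example for a package and version.
pkg_version_payload() {
  printf "{set: {PKG_VERSIONS: {%s: '%s'}}}" "$1" "$2"
}

# Usage: set_pkg_version <collection> <package> <version>
# SOLR_URL is an assumed variable; its default matches the example above.
set_pkg_version() {
  curl "${SOLR_URL:-http://localhost:8983}/api/collections/$1/config/params" \
    -H 'Content-type:application/json' -d "$(pkg_version_payload "$2" "$3")"
}
```

Calling `set_pkg_version collection1 mypackage 1.0.0` would send the same request as the example.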
==== Verifying the Deployment ==== Verify the Deployment
After deploying, verify that the collection is using the package: After deploying, verify that the collection is using the package:
[source,bash] [source,bash]
---- ----
$ bin/solr package list-deployed -c <collection> $ bin/solr package list-deployed -c <collection>
---- ----
=== Updating Packages === Updating Packages
In order to update a package, the first step is to make sure the updated version is available in the added repositories by running the `list-available` command. Next, install the new version of the package from the repositories. In order to update a package, the first step is to make sure the updated version is available in the added repositories by running the `list-available` command shown above in <<Listing and Installing Packages>>.
Next, install the new version of the package from the repositories.
[source,bash] [source,bash]
---- ----
$ bin/solr package install <package-name>:<version> $ bin/solr package install <package-name>:<version>
---- ----
Now, you can selectively update each of your collections using the old version (say, `1.0.0`) of the package (say, `mypackage`) to the newly added version (say `2.0.0`) as follows: Once you have installed the new version, you can selectively update each of your collections. Assuming the old version is `1.0.0` of the package `mypackage`, and the new version is `2.0.0`, the command would be as follows:
[source,bash] [source,bash]
---- ----
@ -149,6 +157,7 @@ $ bin/solr package deploy mypackage:2.0.0 --update -collections mycollection
You can run the `list-deployed` command to verify that this collection is using the newly added version. You can run the `list-deployed` command to verify that this collection is using the newly added version.
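Because collections are updated selectively, it can help to review the commands for each collection before running them. The sketch below only prints the commands; it does not run them. The function name `print_update_commands` is hypothetical, and the printed commands follow the `deploy ... --update` and `list-deployed` forms shown above.

```shell
# Dry run: prints the update and verification commands for each collection.
# Usage: print_update_commands <package> <new-version> <collection>...
print_update_commands() {
  pkg="$1"
  version="$2"
  shift 2
  for collection in "$@"; do
    echo "bin/solr package deploy $pkg:$version --update -collections $collection"
    echo "bin/solr package list-deployed -c $collection"
  done
}
```

For example, `print_update_commands mypackage 2.0.0 collection1 collection2` prints four lines, which can be inspected and then executed one collection at a time.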
== Security == Security
Except for the `add-repo` step, all other steps can be executed using an HTTP endpoint in Solr (see <<package-manager-internals.adoc#package-manager-internals,Package Manager internals>>). This step registers the public key of the trusted repository, and hence can only be executed using the package manager (CLI) having direct write access to ZooKeeper. Hence, as you can imagine, it is important to protect ZooKeeper from unauthorized write access.
Also, keep in mind that it is possible to install any package from a trusted and already added repository. Hence, if you want to use some packages in production, then it is better to set up your own repository and add that to Solr, instead of adding a generic third-party repository that is beyond your administrative control. As noted above in the section <<Add Trusted Repositories>>, only repositories served over HTTPS should be added. Except for the `add-repo` step, all other steps can be executed using an HTTP endpoint in Solr (see also <<package-manager-internals.adoc#package-manager-internals,Package Manager Internals>>). The `add-repo` step registers the public key of the trusted repository, and hence can only be executed using the package manager (CLI) having direct write access to ZooKeeper. It is critical to protect ZooKeeper from unauthorized write access.
Also, keep in mind that it is possible to install *any* package from a repository once it has been added. If you want to use some packages in production, a best practice is to set up your own repository and add that to Solr instead of adding a generic third-party repository that is beyond your administrative control.