SOLR-4793: Document usage of ZooKeeper's jute.maxbuffer sysprop for increasing the file size limit above 1MB

Steve Rowe 2018-04-20 16:06:22 -04:00
parent 76578cf17b
commit 22c4b9c36f
4 changed files with 77 additions and 1 deletions

@ -77,6 +77,10 @@ import org.slf4j.LoggerFactory;
* <p>See the <a href="http://opennlp.apache.org/models.html">OpenNLP website</a>
* for information on downloading pre-trained models.</p>
*
* <p>Note that in order to use model files larger than 1MB on SolrCloud,
* <a href="https://lucene.apache.org/solr/guide/setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit"
* >ZooKeeper server and client configuration is required</a>.</p>
*
* <p>
* The <code>source</code> field(s) can be configured as either:
* </p>

@ -560,6 +560,8 @@ NOTE: No `"features"` are configured in `myWrapperModel` because the features of
CAUTION: `<lib dir="/path/to/models" regex=".*\.json" />` doesn't work as expected in this case, because `SolrResourceLoader` treats the given resources as JARs when `<lib />` points to files.
As an alternative to the above-described `DefaultWrapperModel`, it is possible to <<setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit,increase ZooKeeper's file size limit>>.
=== Applying Changes
The feature store and the model store are both <<managed-resources.adoc#managed-resources,Managed Resources>>. Changes made to managed resources are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded.

@ -349,6 +349,76 @@ set ZK_HOST=zk1:2181,zk2:2181,zk3:2181/solr
Now you will not have to enter the connection string when starting Solr.
== Increasing ZooKeeper's 1MB File Size Limit
ZooKeeper is designed to hold small files, on the order of kilobytes. By default, ZooKeeper's file size limit is 1MB. Attempting to write or read files larger than this will cause errors.
Some Solr features, e.g. text analysis synonyms, LTR, and OpenNLP named entity recognition, require configuration resources that can be larger than the default limit. ZooKeeper can be configured, via the Java system property https://zookeeper.apache.org/doc/r{ivy-zookeeper-version}/zookeeperAdmin.html#Unsafe+Options[`jute.maxbuffer`], to increase this limit. This setting is required on the ZooKeeper server(s) and on every client that connects to them, and its value must be the same everywhere it is specified.
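As a rough way to find out whether a configset is affected, the following sketch lists local files larger than ZooKeeper's default limit of `0xfffff` (1,048,575 bytes). The function name `files_over_limit` and its directory argument are illustrative assumptions, not part of Solr or ZooKeeper, and the check is only an approximation because it looks at file sizes on disk.

```shell
# Default jute.maxbuffer value according to the ZooKeeper admin guide: 0xfffff bytes.
DEFAULT_LIMIT=$((0xfffff))

# List regular files under a directory whose size exceeds a byte limit.
# $1: directory to scan (hypothetical local copy of a configset)
# $2: optional limit in bytes; defaults to ZooKeeper's default limit
files_over_limit() {
  conf_dir=$1
  limit=${2:-$DEFAULT_LIMIT}
  # "-size +Nc" matches files larger than N bytes
  find "$conf_dir" -type f -size +"${limit}c"
}
```

For example, `files_over_limit /path/to/configset/conf` prints every file that would need a larger `jute.maxbuffer`; the path shown is a placeholder.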
=== Configuring jute.maxbuffer on ZooKeeper nodes
`jute.maxbuffer` must be configured on each external ZooKeeper node. This can be achieved in any of the following ways; note though that only the first option works on Windows:
. In `<ZOOKEEPER_HOME>/conf/zoo.cfg`, e.g. to increase the file size limit to one byte less than 10MB, add this line:
+
[source,properties]
jute.maxbuffer=0x9fffff
. In `<ZOOKEEPER_HOME>/conf/zookeeper-env.sh`, e.g. to increase the file size limit to approximately 50MB (50,000,000 bytes), add this line:
+
[source,properties]
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=50000000"
. In `<ZOOKEEPER_HOME>/bin/zkServer.sh`, add a `JVMFLAGS` environment variable assignment near the top of the script, e.g. to increase the file size limit to approximately 5MB (5,000,000 bytes):
+
[source,properties]
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=5000000"
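The values above mix hexadecimal and decimal notation, so it can help to convert them to plain byte counts before choosing a limit. The helper names `to_bytes` and `mib_to_bytes` below are hypothetical; the sketch relies only on shell arithmetic.

```shell
# Print a jute.maxbuffer value (decimal or 0x-prefixed hex) as a decimal byte count.
to_bytes() {
  echo $(( $1 ))
}

# Print the number of bytes in N mebibytes (N * 1024 * 1024).
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

to_bytes 0x9fffff   # one byte less than 10 * 1024 * 1024
mib_to_bytes 50     # exact byte count for 50MiB
```

Note that a decimal value such as `50000000` is slightly smaller than the result of `mib_to_bytes 50`.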
=== Configuring jute.maxbuffer for ZooKeeper clients
The `bin/solr` script invokes Java programs that act as ZooKeeper clients. (When you use Solr's bundled ZooKeeper server instead of setting up an external ZooKeeper ensemble, the configuration described below will also configure the ZooKeeper server.)
Add the setting to the `SOLR_OPTS` environment variable in Solr's include file (`bin/solr.in.sh` or `bin/solr.in.cmd`):
[.dynamic-tabs]
--
[example.tab-pane#linux2]
====
[.tab-label]*Linux: solr.in.sh*
Look for the section that begins:
[source,properties]
----
# Anything you add to the SOLR_OPTS variable will be included in the java
# start command line as-is, in ADDITION to other options. If you specify the
# -a option on start script, those options will be appended as well. Examples:
----
Add the following line to increase the file size limit to 2MB:
[source,properties]
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=0x200000"
====
[example.tab-pane#zkwindows2]
====
[.tab-label]*Windows: solr.in.cmd*
Look for the section that begins:
[source,bat]
----
REM Anything you add to the SOLR_OPTS variable will be included in the java
REM start command line as-is, in ADDITION to other options. If you specify the
REM -a option on start script, those options will be appended as well. Examples:
----
Add the following line to increase the file size limit to 2MB:
[source,bat]
set SOLR_OPTS=%SOLR_OPTS% -Djute.maxbuffer=0x200000
====
--
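Because the client value must match the value configured on the ZooKeeper nodes, a small consistency check can catch mismatches such as `0x200000` on one side and `2000000` on the other. This is a minimal sketch; `maxbuffer_from_opts` and `same_maxbuffer` are hypothetical helper names, and the option string is passed in explicitly rather than read from a running process.

```shell
# Extract the jute.maxbuffer value from a string of JVM options.
# Prints the last occurrence, or nothing if the option is absent.
maxbuffer_from_opts() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^-Djute\.maxbuffer=//p' | tail -n 1
}

# Succeed when two settings (decimal or 0x-prefixed hex) denote the same byte count.
same_maxbuffer() {
  [ "$(( $1 ))" -eq "$(( $2 ))" ]
}
```

For example, `same_maxbuffer "$(maxbuffer_from_opts "$SOLR_OPTS")" 0x200000` succeeds only when `SOLR_OPTS` carries a value equal to 2,097,152 bytes.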
== Securing the ZooKeeper Connection
You may also want to secure the communication between ZooKeeper and Solr.

@ -355,7 +355,7 @@ The {solr-javadocs}/solr-uima/index.html[`uima`] contrib provides::
The {solr-javadocs}/solr-analysis-extras/index.html[`analysis-extras`] contrib provides::
{solr-javadocs}/solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html[OpenNLPExtractNamedEntitiesUpdateProcessorFactory]::: Update document(s) to be indexed with named entities extracted using an OpenNLP NER model.
{solr-javadocs}/solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html[OpenNLPExtractNamedEntitiesUpdateProcessorFactory]::: Update document(s) to be indexed with named entities extracted using an OpenNLP NER model. Note that in order to use model files larger than 1MB on SolrCloud, <<setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit,ZooKeeper server and client configuration is required>>.
=== Update Processor Factories You Should _Not_ Modify or Remove